Social behavior hypothesis testing

ABSTRACT

A relational event history is determined based on a data set, the relational event history including a set of relational events that occurred in time among a set of actors. Data is populated in a probability model based on the relational event history, where the probability model is formulated as a series of conditional probabilities that correspond to a set of sequential decisions by an actor for each relational event, and where the probability model includes one or more statistical parameters and corresponding statistics. A baseline communications behavior for the relational event history is determined based on the populated probability model, and departures from the baseline communications behavior within the relational event history are determined. Determining departures includes determining and testing a hypothesis regarding communications behavior within the relational event history.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application claims priority from U.S. Provisional Application No. 61/771,611, filed Mar. 1, 2013, U.S. Provisional Application No. 61/771,625, filed Mar. 1, 2013, and U.S. Provisional Application No. 61/803,876, filed Mar. 21, 2013. All of these prior applications are incorporated by reference in their entirety.

FIELD

The present application relates to statistically modeling social behavior.

BACKGROUND

Descriptive statistics may be used to quantitatively describe a collection of data gathered from several sources, in order to summarize features of the data, and electronic communications generate data trails that can be used to formulate statistics. Email records, for example, may be parsed to identify a sender of an email, a topic or subject of the email, and one or more recipients of the email. Similarly, phone records provide information that may be used to identify a caller and a recipient of a phone call.

Descriptive statistics are not developed on the basis of probability theory and, as such, are not used to draw conclusions from data arising from systems affected by random variation. Inferential statistics, on the other hand, are based on probability theory and are used to draw conclusions from data that is subject to random variation.

Social behavior, such as communications behavior, may involve uncertainty. In this case, inferential statistics, either alone or in combination with descriptive statistics, may be applied to data reflecting the social behavior to draw inferences about such behavior.

SUMMARY

According to one general implementation, a relational event history is determined based on a data set, the relational event history including a set of relational events that occurred in time among a set of actors. Data is populated in a probability model based on the relational event history, where the probability model is formulated as a series of conditional probabilities that correspond to a set of sequential decisions by an actor for each relational event, and where the probability model includes one or more statistical parameters and corresponding statistics. A baseline communications behavior for the relational event history is determined based on the populated probability model, wherein the baseline comprises a first set of values for the one or more statistical parameters, and departures from the baseline communications behavior within the relational event history are determined. Determining departures from the baseline communications behavior within the relational event history includes selecting a subset of relational events included within the relational event history, determining a second set of values for the statistical parameters based on the subset of relational events, determining a hypothesis regarding communications behavior within the relational event history, and testing the hypothesis using the second set of values.

In one aspect, the one or more statistics relate to one or more of senders of relational events, modes of relational events, topics of relational events, or recipients of relational events.

In another aspect, determining the hypothesis regarding communications behavior within the relational event history may include selecting a set of predictions regarding temporal differencing of communications behavior with respect to one or more of the statistical parameters over a period of time. The set of predictions regarding temporal differencing may include predictions including a value of a statistical parameter, a velocity of a statistical parameter, and an acceleration of a statistical parameter. The hypothesis may relate to one or more of senders of relational events, modes of relational events, topics of relational events, or recipients of relational events.

In a further aspect, testing the hypothesis using the second set of values may include computing a value of a test statistic based on the second set of values and using the test statistic to determine departures from the baseline communications behavior. Using the test statistic to determine departures from the baseline communications behavior may include comparing the value of the test statistic to the hypothesis.

In another aspect, each value in the second set of values may correspond to a particular relational event included in the subset of relational events.

Other implementations of these aspects include corresponding systems, apparatus, and computer programs, configured to perform the described techniques, encoded on computer storage devices.

The details of one or more implementations of the subject matter described in this specification are set forth in the accompanying drawings and the description below. Other potential features, aspects, and advantages of the subject matter will become apparent from the description, the drawings, and the claims.

DESCRIPTION OF DRAWINGS

FIG. 1 is a diagram of an example of a system that analyzes a relational event history.

FIG. 2 is a flowchart of an example of a process for analyzing a relational event history.

FIG. 3 is a flowchart of an example of a process for determining a baseline communications behavior for a relational event history.

FIG. 4 is a flowchart of another example of a process for determining a baseline communications behavior for a relational event history.

FIG. 5 is a flowchart of an example of a process for modeling a set of sequential decisions made by a sender of a communication in determining recipients of the communication.

FIG. 6 is a flowchart of an example of a process for determining a statistic using a decay function.

FIG. 7 is a flowchart of an example of a process for analyzing social behavior.

FIGS. 8-23 provide examples of user interfaces and of data visualizations that may be output to a user.

Like reference symbols in the various drawings indicate like elements.

DETAILED DESCRIPTION

Because communications behavior is complex, and because the decisions involved in one actor sending a communication to another are typically not predictable in advance, the drawing of intelligent inferences from a set of communications-related data can benefit from the development and fitting of a probability model (a mathematical model that is probabilistic in nature). Once a model is developed and fit to a data set, statistical methods can be used to detect patterns, changes, and anomalies in communications behavior, and to test related hypotheses.

One way to approach modeling of communications between actors is to formulate a probability density function as a series of conditional probabilities that can be represented by decisions made sequentially. More specifically, for each relational event (e.g., a communications event occurring at a moment in time among a finite set of actors) in a set of relational events, the probability of the event's occurrence can be decomposed such that it is the product of the probabilities of each of at least four sequential decisions made by the event's initiating actor: (1) a decision to send a communication at a point in time; (2) a decision as to a mode or channel of the communication; (3) a decision as to a topic or content of the communication; and (4) one or more decisions as to one or more recipients of the communication. Although other sequences are possible (e.g., omitting consideration of the topic of the communication, or considering the recipient in advance of the mode), the sequence of sender, mode, topic, and recipient(s) can be useful, in appropriate cases, from a computational standpoint.

Due to the heterogeneity of time-related aspects of communications modeling (e.g., communication patterns differ between work days and weekends), modeling time directly may be difficult. As such, a proportional hazards model, similar to the Cox model, may be employed, so that the “risk” of a specific relational event occurring is relative to other possible relational events, which allows for the prediction of which events are most likely to occur next, but not specifically when. Within the model, the decisions of an actor to send a communication, and as to the mode, topic, and recipient(s) of the communication, may depend on any relational event that occurred prior to the communication, as well as on covariates that can take exogenous information (e.g., data other than communications data) into account. As such, for each sequential decision related to an event, a multinomial logit distribution may be used to model the probabilities of the discrete choice problem presented by the decision, where the primary factors affecting the probability of an event's occurrence are a set of statistics (also referred to as effects) that describe the event in question.
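As a minimal sketch of how a multinomial logit distribution may assign probabilities to the candidate choices of a single decision, the following Python example scores each choice as a weighted sum of its statistics and normalizes the exponentiated scores. The function name, the candidate modes, the statistic values, and the coefficients are hypothetical and are chosen only for illustration.

```python
import math

def multinomial_logit(stats_by_choice, coefficients):
    """Return choice probabilities proportional to exp(coefficients . statistics)."""
    scores = {
        choice: sum(c * s for c, s in zip(coefficients, stats))
        for choice, stats in stats_by_choice.items()
    }
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    weights = {choice: math.exp(score - top) for choice, score in scores.items()}
    total = sum(weights.values())
    return {choice: w / total for choice, w in weights.items()}

# Hypothetical mode decision: two statistics per candidate mode (assumed values).
mode_stats = {"email": [1.0, 0.0], "phone": [0.0, 1.0], "chat": [0.0, 0.0]}
mode_probs = multinomial_logit(mode_stats, coefficients=[0.5, -0.5])
```

In this sketch a positive coefficient raises the probability of choices whose statistic is large, and a negative coefficient lowers it, which is the sense in which a coefficient may describe a statistic's effect on event probabilities.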

An arbitrary number of statistics can be included in the model, and the types of statistics included may form the basis for the kinds of inferences that can be drawn from the model. The statistics may be based in social network theory, and may relate to one or more of senders, modes, topics, or recipients of relational events. A “trigger” effect, for example, relates to the tendency of an actor to communicate based on the recency of a received communication, while a “broken record” effect relates to an overall tendency to communicate about a topic. Each statistic (or effect) may be a scalar value. In some cases, a coefficient (referred to as a statistical parameter or effect value) is multiplied by these statistic values within the model. The statistical parameter may therefore indicate the statistic's effect on event probabilities.

Once the statistics to include in the model have been chosen, the model can be populated based on a relational event history that is determined from a data set. Using the populated model, a baselining process may be employed to determine communications behavior that is “normal” for a given relational event history. This baseline includes a set of effect values (statistical parameters) that correspond to the statistics included in the model. This baseline and the populated model can be used to determine departures from the normal communications behavior.

In this way, a probability model fitted to a relational event history derived from event and other data may be used to identify patterns, changes, and anomalies within the relational event history, and to draw sophisticated inferences regarding a variety of communications behaviors.

FIG. 1 is a diagram of an example of a system that analyzes a relational event history. The system 100 includes data sources 101 and 102, data normalizer 103, inference engine 107, and user interface (“UI”) 108. The data normalizer 103 may be used to determine relational event history 104 and covariate data 105, and the inference engine 107 may be used to determine model 106, and to draw inferences from the model. Data sources 101 and 102 may be connected to data normalizer 103 through a network, and normalizer 103, inference engine 107, and UI 108 may interface with other computers through the network. The data sources 101 and 102 may be implemented using a single computer, or may instead be implemented using two or more computers that interface through the network. Similarly, normalizer 103, inference engine 107, and UI 108 may be implemented using a single computer, or may instead be implemented using two or more computers that interface through the network.

Inference engine 107 may, for example, be implemented on a distributed computing platform in which various processes involved in drawing inferences from a relational event history and a set of covariates are distributed to a plurality of computers. In such an implementation, one or more computers associated with the inference engine 107 may access a relational event history, a set of covariates that have been formatted by normalizer 103, and a set of focal actors whose communications behavior will be the subject of analysis. Each of the computers may use the accessed data to perform a portion of the computations involved in populating the model 106, determining a baseline, and/or determining departures from the baseline.

For example, in some implementations, the model may include a set of event statistics, or effects, that impact the probability of events. In this case, to populate the model, each of the computers may perform a portion of the computations involved in determining these event statistics, as well as determining the probabilities of the events. Also, for example, in some implementations determining a baseline may involve determining a set of effect values, which are coefficients that are multiplied by the event statistics in the model 106, and accordingly describe the impact of each effect on event probabilities. The process of determining the effect values may involve a maximum likelihood estimation in which each of the computers performs a portion of the computations involved.

Data source 101 may contain data that relates to communications events. The data may have originated from one or more sources, may relate to one or more modes of communication, and may be stored in one or more formats. Data source 101 may, for example, store email records, phone logs, chat logs, and/or server logs. Data source 101 may be, for example, a digital archive of electronic communications that is maintained by an organization, the archive containing data related to communications involving actors associated with the organization.

Data source 102 may contain additional data that, although not directly related to communications events, may nevertheless prove valuable in drawing inferences from a relational event history. Data source 102 may, for example, store information relating to an organization and its personnel, such as organizational charts, staff directories, profit and loss records, and/or employment records. Data source 102 may be, for example, a digital archive that is maintained by an organization, the archive containing data relating to the operations of the organization.

Data normalizer 103 may be used to load, extract, parse, filter, enrich, adapt, and/or publish event data and other data accessed through one or more data sources, such as data sources 101 and 102. Specifically, event data that is received from data source 101 may be normalized to determine a relational event history 104, the normalization process resulting in clean, parsed relational data with disambiguated actors that maps to the categories of decisions included in a probability model 106. An email, for example, may be parsed to reveal the sender, topic, mode, recipient(s), and timestamp of the email, and that data may be used to generate a relational event that corresponds to the email in relational event history 104.

Normalization of the data accessed from data source 101 may involve processing the data so as to produce a relational event history 104 in which, to the extent possible, each event is described using a uniform set of characteristics, regardless of the mode of communication involved in the event and the format in which the event data is stored in data source 101. Each event in the event history may include a “sender” (otherwise referred to as an ego below), which is the actor responsible for initiating the event, a “recipient(s)” (otherwise referred to as an alter below), which is the actor(s) that received the event, a mode identifying the mode of the event, and a timestamp indicating the time that the event occurred. For example, a phone record may result in a relational event that includes the actor responsible for initiating the call (sender), the actor that received the call (recipient), the mode identifying the event as a phone call, and a timestamp indicating the time that the call was made. Other characteristics used to describe an event in relational event history 104 may include, for example, a unique identifier for the event, a duration indicating a length of the event, a type indicating the manner in which data relating to the event was formatted in data source 101, a unique identifier for a digital archive from which the event was accessed, and one or more fields related to enrichment attributes (e.g., additional data describing the actors involved in the event, such as job titles).
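One way such a uniform set of characteristics might be represented is sketched below in Python. The class name, the field names, and the layout of the phone-log record are assumptions made for illustration; they are not a format prescribed by this description.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass(frozen=True)
class RelationalEvent:
    event_id: str                       # unique identifier for the event
    sender: str                         # the ego
    recipients: tuple                   # the alters
    mode: str                           # e.g. "email" or "phone"
    timestamp: datetime                 # time that the event occurred
    topic: Optional[str] = None
    duration_seconds: Optional[float] = None

def from_phone_record(record: dict) -> RelationalEvent:
    """Map a hypothetical phone-log row onto the uniform event shape."""
    return RelationalEvent(
        event_id=record["id"],
        # Lower-casing and trimming stands in for actor disambiguation here.
        sender=record["caller"].strip().lower(),
        recipients=(record["callee"].strip().lower(),),
        mode="phone",
        timestamp=datetime.fromtimestamp(record["start_epoch"], tz=timezone.utc),
        duration_seconds=record.get("duration"),
    )

call = from_phone_record(
    {"id": "c-1", "caller": " Alice ", "callee": "Bob", "start_epoch": 0, "duration": 42.0}
)
```

A comparable function for each other mode (email, chat, and so on) would produce events of the same shape, so that later steps need not know how the source data was stored.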

Data that is received from data source 102 may be normalized by data normalizer 103 in order to produce covariate data 105. Covariates are exogenous variables that may be involved in effects and that may be included in probability model 106. A public company using a probability model to analyze communication patterns might, for example, include the company's daily stock price as a covariate in the model, in order to shed light on the relationship between the price and communications between employees and outsiders. An individual covariate may be a global variable that occurred at or during a particular time for all actors in a relational event history (e.g., relating to an audit), may instead be a dyadic variable that occurred at or during a particular time for a pair of actors in the model (e.g., a supervisor-staff member relationship that existed between two employees), or may be an actor variable that occurred at or during a particular time for a specific actor (e.g., a salary).

As explained above, the inference engine 107 populates probability model 106 based on relational event history 104 and covariate data 105, the model having been formulated as a series of conditional probabilities that correspond to sequential decisions by actors, and including one or more statistical parameters and corresponding statistics that may relate to one or more of senders, modes, topics, or recipients of relational events.

Inference engine 107 may also determine a baseline communications behavior for the model 106, based on the populated model. Inference engine 107 may, for example, determine a set of effect values through a maximum likelihood process, the determined values indicating the impact of the effects on event probabilities. The inference engine 107 may also determine departures from the baseline by determining a second set of values for the effects based on one or more subsets of the relational events included in relational event history 104, and comparing the second set of effect values to the first set of values.

UI 108 may be used to receive inputs from a user of system 100, and may be used to output data to the user. A user of the system may, for example, specify data sources to be used by normalizer 103 in determining relational event history 104 and covariate data 105. The user may also provide inputs indicating which actors, modes, topics, covariates, and/or effects to include in model 106, as well as which subsets of the relational event history should be analyzed in determining departures from the baseline. UI 108 may provide the user with outputs indicating a formulation of model 106, a determined baseline for the model 106, and one or more determinations as to whether and how the subsets of relational event history being analyzed differ from the baseline. UI 108 may also provide the user with additional views and analyses of data so as to allow the user to draw additional inferences from the model 106.

In more detail, UI 108 may provide to the user, based on probability model 106, a determined baseline, and/or one or more determined departures from the baseline, textual and/or graphical analyses of data that uncover patterns, trends, anomalies, and changes in the data. UI 108 may, for example, enable a user to form a hypothesis regarding the data by selecting one or more effects, covariates, and/or sets of focal actors, and curves may be displayed based on the user's hypothesis, where the curves indicate periods of time in which communications behavior is normal, as well as periods of time in which communications behavior is abnormal. Other visualizations of the data, such as summations of communications behavior involving particular actors, modes, and/or topics, may be provided by UI 108.

Curves displayed to the user through UI 108 may correspond to an actor, to a subset of actors, and/or to all actors in the relational event history. Color coding may be used to indicate periods of time in which communications behavior is normal, and periods of time in which communications behavior is abnormal. A user may be able to select between varying levels of sensitivity through which normality and abnormality are determined, and UI 108 may dynamically update as selection of sensitivity occurs.

Multiple curves, each representing different effects or covariates, may be displayed by UI 108 in an overlapping fashion, and the curves may be normalized and smoothed. The curves may be displayed in two dimensions, where the x axis is associated with time, and where the y axis is associated with properties of effects and/or covariates. UI 108 may display icons and/or labels to indicate effects, covariates, or actors to which curves correspond. UI 108 may further display labels within the two dimensional field in which the curves are plotted, the labels indicating exogenous events or providing other information.

UI 108 may also enable a user to zoom in to the displayed curves in order to view a display of events at a more granular level, with information pertaining to the displayed events being presented through icons that may be color-coded according to mode, and that may be sized to indicate, for example, a word count or length of conversation. UI 108 may also provide access to individual communications related to events. UI 108 may, for example, display an individual email in response to a user selection or zoom.

In response to a user selection, UI 108 may also display, for a particular actor, behavioral activity over time. The behavioral activity may be displayed, for example, in a polar coordinate system, where the radial coordinates of points in the plane correspond to days, and where angular coordinates correspond to time slices within a particular day.

In more detail, each point plotted in the plane may correspond to a particular communications event, and may be color coded to indicate a mode of communication or other information. Points plotted in the plane may be located on one of several concentric circles displayed on the graph, and the circle on which a point is located may indicate a day on which the communication took place, while the position of the point on the circle may indicate a time slice (e.g., the hour) in which the communication took place. A user may be able to select a communication event that is plotted on the graph to obtain content. For example, a user may click or hover over a point that indicates an SMS message in order to retrieve the text of the SMS message.
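The mapping from an event's timestamp to a position on such a polar graph might be computed as in the following Python sketch. The function name, the choice of hourly time slices, and the convention that the innermost circle is the first day are assumptions made for illustration.

```python
import math
from datetime import datetime

def polar_position(event_time, first_day, slices_per_day=24):
    """Return (radius, angle): radius is the day's circle, angle the time slice."""
    day_index = (event_time.date() - first_day.date()).days
    seconds = event_time.hour * 3600 + event_time.minute * 60 + event_time.second
    time_slice = seconds * slices_per_day // 86400
    radius = day_index + 1  # one concentric circle per day, innermost is day one
    angle = 2 * math.pi * time_slice / slices_per_day
    return radius, angle

start = datetime(2013, 3, 1)
# An event at 06:30 on the third day falls in hourly slice 6 of circle 3.
radius, angle = polar_position(datetime(2013, 3, 3, 6, 30), start)
```

Because every event within the same time slice shares an angle, a plotting layer built on this sketch would likely need to jitter or stack coincident points.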

In some implementations of UI 108, a polar graph indicating behavioral activity over time may be animated, enabling a user to “fly through” a visual representation of behavioral activity over particular periods of time, where the circles and events presented for a particular period of time correspond to the days within that period, and where the period displayed dynamically shifts in response to user input.

FIG. 2 is a flowchart of an example of a process 200 for analyzing a relational event history. The process 200 may be implemented, for example, using system 100, although other systems or configurations may be used. In such an implementation, one or more parts of the process may be executed by data normalizer 103 or inference engine 107, which may interface with other computers through a network. Data normalizer 103 may retrieve data involved in the process, such as data used in determining a relational event history or covariates, from one or more local or remote data sources, such as data sources 101 and 102.

Process 200 begins when data normalizer 103 accesses event and/or other data from data sources 101 and 102 (201). After accessing the event and/or other data, the data normalizer 103 normalizes the accessed data, determining a relational event history 104 and covariate data 105 (203).

The process of normalizing event and other data and determining a relational event history and covariates may involve, among other things, extracting event data from accessed data, transforming the extracted data to a state suitable for populating the probability model 106, and enriching the transformed data with additional information gathered from other data sources. The accessed event data may, for example, include an email sent by one actor to two others, the email relating to a topic specified in the probability model 106. The data normalizer 103 may parse the email to extract a time stamp, a sender, a topic, and recipients. Extracted data may then be transformed by data normalizer 103, resulting in relational data with disambiguated actors that maps to the categories of decisions included in a probability model 106. The data normalizer 103 may then enrich the transformed data by, for example, scraping one or more websites to obtain additional data relating to the sender or recipients. The enriched data representing a relational event corresponding to the email may then be added to relational event history 104. The data normalizer 103 may perform similar operations in the process of producing covariate data 105.

A probability model 106 may be populated by inference engine 107 based on the relational event history 104 and covariate data 105 (205). In some implementations, the probability model 106 is formulated as a series of conditional probabilities that correspond to a set of sequential decisions by an actor. For each event, the set of sequential decisions includes a decision to send a communication, a decision as to a mode of the communication, a decision as to a topic of the communication, and one or more decisions as to recipients of the communication. A multinomial logit distribution can be used to model the probabilities of the discrete choice problem presented by each decision, with the primary factors affecting the probability of an event occurring being a set of event statistics (also referred to as effects), which are based in social network theory. For example, a woman's probability of emailing her boss may be different if she has received 4 emails from clients within the past 24 hours than if she has received none. Likewise, a man may be more or less likely to make a phone call depending on the number of people who have called him recently.

The model's multinomial logit distribution may include a coefficient that is multiplied by these event statistic values and therefore describes the statistic's effect on event probabilities. This coefficient may be referred to as an effect value or a statistical parameter. Effects may be formulated using backward looking statistics, such as time-decayed sums of recent events of certain types, or the last event that was received by the ego (sender) in question, which may be referred to as participation shifts.
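A time-decayed sum of recent events might, for instance, be computed as in the Python sketch below. The exponential form of the decay, the half-life parameter, and the sample timestamps are assumptions made for illustration; other decay functions could be substituted.

```python
import math

def decayed_count(event_times, now, half_life):
    """Sum of exponentially decayed weights for events strictly before `now`."""
    rate = math.log(2) / half_life
    # Only backward-looking information is used: events at or after `now` are skipped.
    return sum(math.exp(-rate * (now - t)) for t in event_times if t < now)

# Hypothetical timestamps (in hours) of emails received by one ego.
received = [0.0, 20.0, 23.0, 30.0]
trigger_stat = decayed_count(received, now=24.0, half_life=4.0)
```

With a four-hour half-life, the email received one hour ago contributes most of the statistic, the one received a day ago contributes almost nothing, and the email at hour 30 is ignored because it lies in the future relative to the decision being modeled.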

In more detail, a relational event history ε, such as relational event history 104, may include relational events that occur at moments in time among a finite set of actors. The composition of this set of actors may depend on time t, and is denoted 𝒜(t). Any ordered pair of actors is called a dyad, and relational events are denoted as tuples, where:

-   the sender of the event, also called the ego, is represented by i∈𝒜(t),
-   the mode of the event is represented by m∈ℳ,
-   the topic of the event is represented by b∈ℬ,
-   the recipients of the event, also called the alters, are represented by the set of j∈𝒥⊂𝒜(t), and
-   the timestamp of the event is represented by t∈[0, ∞).

For notational convenience, a number of functions are defined for an event e:

i(e) refers to the ego of event e,

m(e) refers to the mode of event e,

b(e) refers to the topic of event e,

𝒥(e) refers to the alters of event e, and

τ(e) refers to the timestamp of event e.

A strictly ordered set of these relational events constitutes a relational event history, which may be represented through use of the shorthand notation:

ε_(t) ={e∈ε:τ(e)<t}.
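The shorthand above selects the events whose timestamps are strictly earlier than t. A minimal Python sketch of that selection follows; it assumes, for illustration only, that the history is held as a list of (timestamp, payload) tuples already sorted by timestamp.

```python
from bisect import bisect_left

def history_before(events, t):
    """Return the events with timestamp strictly less than t.

    `events` is assumed to be a list of (timestamp, payload) tuples sorted
    by timestamp, mirroring a strictly ordered relational event history.
    """
    timestamps = [ts for ts, _ in events]
    # bisect_left finds the first index whose timestamp is >= t.
    return events[:bisect_left(timestamps, t)]

events = [(1.0, "a->b"), (2.5, "b->a"), (4.0, "c->a")]
prior = history_before(events, 2.5)
```

The strict inequality matters: when the probability of the event at time 2.5 is evaluated, that event itself is excluded from the history on which it is conditioned.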

The event history is an endogenous variable. Other, exogenous variables (i.e., covariates) may also be included in the probability model, and these covariates may be partitioned into three types:

1. global covariates that have the same value for all actors and relationships. The global covariates are denoted by the set 𝒲. A covariate W_(a)∈𝒲 is a function that maps a timestamp t to a real value W_(a)(t).
2. actor covariates that are permitted different values for each actor. The actor covariates are denoted by the set χ. A covariate X_(a)∈χ is a function that maps an actor/timestamp pair (i,t) to a real value X_(a)(i,t).
3. dyadic covariates that are permitted different values for each dyad. The dyadic covariates are denoted by the set γ. A covariate Y_(a)∈γ is a function that maps a dyad/timestamp triple (i,j,t) to a real value Y_(a)(i,j,t).

Typically, covariates will have a time domain equal to the relational event history. The combination of the relational event history and the covariates constitutes a dataset that may be denoted:

$$\mathcal{D} = (\varepsilon, \mathcal{W}, \chi, \gamma)$$

A probability density representing the probability of the occurrence of each relational event in the relational event history and based on the dataset may be denoted:

$$f(\varepsilon \mid \mathcal{W}, \chi, \gamma, \mathcal{D}_0, \theta)$$

where 𝒟₀ denotes all information before t=0 and θ is a statistical parameter.

The probabilities may be conditioned on all prior events and information, including an initial state t=0, such that the probability density may be rewritten:

$$f(\varepsilon \mid \mathcal{W}, \chi, \gamma, \mathcal{D}_0, \theta) = \prod_{e \in \varepsilon} f(e \mid \varepsilon_{\tau(e)}, \mathcal{W}, \chi, \gamma, \mathcal{D}_0, \theta)$$

For practical reasons, it may be beneficial to restrict the conditional densities such that only covariate information known at time t can influence the probabilities of event occurrence. Accordingly, the probability density can be simplified by requiring that:

$$f(e \mid \varepsilon_{\tau(e)}, \mathcal{W}, \chi, \gamma, \mathcal{D}_0, \theta) = f(e \mid \mathcal{D}_{\tau(e)}, \theta)$$

The probability density may be decomposed into a series of conditional probabilities that correspond to a set of sequential decisions by an actor:

$$\begin{aligned}
f(e \mid \mathcal{D}_{\tau(e)}, \theta) ={} & f(i(e), \tau(e) \mid \mathcal{D}_{\tau(e)}, \theta) \\
& \times f(m(e) \mid i(e), \tau(e), \mathcal{D}_{\tau(e)}, \theta) \\
& \times f(b(e) \mid i(e), \tau(e), m(e), \mathcal{D}_{\tau(e)}, \theta) \\
& \times f(\mathcal{J}(e) \mid i(e), \tau(e), m(e), b(e), \mathcal{D}_{\tau(e)}, \theta)
\end{aligned}$$

The conditional probabilities included in this version of the model consist of:

-   the joint ego/interarrival density f(i(e),τ(e)| . . . ), which describes the process by which the ego (i.e., sender) of an event waits some time and then decides to send an email,
-   the mode density f(m(e)| . . . ), which describes the process of the ego choosing how to communicate (e.g., by email or by phone),
-   the topic density f(b(e)| . . . ), which describes the process of the ego deciding to communicate about one of some finite set of topics, and
-   the alter density f(𝒥(e)| . . . ), which describes the process of the ego deciding to communicate with some set of alters (i.e., recipients).

The ego/interarrival density, which describes a process by which an ego chooses to initiate an event following a most-recent event, can be modeled using a semi-parametric approach in the spirit of the Cox proportional hazards model:

Suppose that t_(*) is the time of the most recent event to occur, or 0 if none has yet occurred. We assume that, for all egos i∈𝒜(t) and for all interarrival times t∈(0,∞) until the next event,

$$f(i, t \mid \mathcal{D}_{t_* + t}, \theta) = \lambda_0(t) \times \exp \varphi_I(i \mid \mathcal{D}_{t_*}, \theta)$$

Notably, the function φ_(I), which models the probability density that the next event to occur after t_(*) entails ego i sending an event after a holding time of t, does not depend on the interarrival time t, and therefore cannot incorporate information accumulated since the last event. Moreover, the baseline hazard rate λ₀(t) does not depend on the ego i. As such, the hazards between all candidate egos are proportional:

f  ( i a , t  |  t * + t , θ ) f  ( i b , t  |  t * + t , θ ) =exp   φ I  ( i a  |  t * , θ ) exp   φ I  ( i b  |  t * , θ )

For convenience, a linear form for φ_(I) may be used:

φ_(I)(i|

_(t) _(*) ,θ)=α^(T) s _(I)(i,

_(t) _(*) )

-   -   where s_(I)(i,        _(t) _(*) ) is a statistic and α⊂θ.

This form allows the baseline hazard rate to be ignored in modeling, as estimation entails only a partial likelihood. Thus, the optimal parameter θ can be found by maximizing, over the relational event history, the probability density that the ego for each event would send an event, as opposed to the other actors:

P  ( i  ( e ) = i *  |  τ  ( e ) , θ ) = exp   φ I  ( i *  | τ  ( e ) , θ ) ∑ i ∈   ( τ  ( e ) )   exp   φ I  ( i  |  τ ( e ) , θ )
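The partial likelihood above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names, the array layout (one row of statistics s_I per candidate ego), and the use of NumPy are not taken from the specification.

```python
import numpy as np

def ego_choice_probabilities(alpha, stats):
    """Return P(i(e) = i) for every candidate ego.

    alpha: (k,) parameter vector (the alpha part of theta).
    stats: (n_egos, k) array; row i holds the statistic s_I for candidate ego i.
    Both layouts are assumptions made for this sketch.
    """
    scores = stats @ alpha            # linear form phi_I = alpha^T s_I
    scores = scores - scores.max()    # shift for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()    # baseline hazard cancels out

def ego_partial_log_likelihood(alpha, events):
    """Sum of log P(i(e) = observed ego) over (stats, chosen_index) pairs."""
    return sum(np.log(ego_choice_probabilities(alpha, stats)[chosen])
               for stats, chosen in events)
```

Because the baseline hazard rate appears in both the numerator and the denominator, it never needs to be evaluated in this calculation.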

The fact that the function φ_(I) does not depend on the interarrival time t, and therefore cannot incorporate information accumulated since the last event, necessitates a further assumption regarding covariate values, namely, that changes to covariate values occur only at moments when events occur. More formally:

-   -   given an event history ε, a covariate ζ, and two time points t        and t+dt≧t, if there exists no e∈ε such that τ(e)∈(t,t+dt], then        ζ(t)=ζ(t+dt).

The incorporation of the passage of time into the impact of an event on future events may be modeled through a decay function using the following general form:

${s\left( {t,\ldots}\mspace{11mu} \right)} = {\sum\limits_{\{{e \in {\mathcal{E}:\; {\Psi_{I}({e,\; \ldots}\mspace{11mu})}}}\}}\; {\delta \left( {t,{\tau (e)}} \right)}}$

-   -   where Ψ_(I)(e, . . . ) is called a predicate, and describes        criteria by which a subset of events is selected
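As a hedged illustration, the statistic can be computed by filtering past events with a predicate and summing a decay weight for each match. The `Event` record, the exponential form of the decay function δ, and the `half_life` parameter are assumptions introduced for this sketch; the specification leaves δ general.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    sender: str            # ego i(e)
    recipients: frozenset  # alter set of the event
    time: float            # tau(e)

def exponential_decay(t, event_time, half_life=1.0):
    """One possible delta(t, tau(e)): the weight halves every `half_life` units."""
    return 0.5 ** ((t - event_time) / half_life)

def decayed_statistic(t, events, predicate, decay=exponential_decay):
    """s(t, ...) = sum of delta(t, tau(e)) over past events that satisfy the predicate."""
    return sum(decay(t, e.time) for e in events if e.time < t and predicate(e))

# Predicates for a hypothetical candidate ego "a":
activity = lambda e: e.sender == "a"        # events sent by the candidate
popularity = lambda e: "a" in e.recipients  # events sent to the candidate
```

Swapping the predicate is enough to move from one effect to another, which is why the general form separates event selection from the decay weighting.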

Effects that evaluate an actor's decision to initiate a communication may be referred to as “ego effects.” It is possible to develop effects that depend on either or both of the relational event history and covariates. Ego effects that depend on the relational event history may include the activity ego selection effect and the popularity ego selection effect, both of which quantify the amount of communications that have recently been sent to (popularity) or sent by (activity) the candidate ego i*, where recency is determined by specifying the effects through predicates:

the activity ego selection effect

Ψ_(I)(e,i*)={i*=i(e)}

the popularity ego selection effect

Ψ_(I)(e,i*)={i*∈

(e)}

Ego effects that depend on covariates may include the actor covariate ego selection effect and the indegree dyadic covariate ego selection effect:

-   -   the actor covariate ego selection effect for actor covariate        x_(a)

s _(I)(i*,

_(t))=x _(a)(i*),

-   -   and the indegree dyadic covariate ego selection effect for dyadic        covariate y_(a)

s I  ( i * , t ) = ∑ j ∈  i   y a  ( j , i * ) .

The mode density, which describes the process of the ego choosing one of a finite set of modes, can be represented through a conditional distribution:

f(m(e)|i(e),τ(e),

_(τ)(e),θ)

A model for the selection of a mode by the ego is the multinomial logit, or discrete choice distribution:

${f\left( {{m(e)} = {m^{*}\text{|}\ldots}}\mspace{11mu} \right)} = \frac{\exp \; {\varphi_{M}\left( {m^{*},\ldots}\; \right)}}{\sum\limits_{m \in \mathcal{M}}\; {\exp \; {\varphi_{M}\left( {m,\ldots}\; \right)}}}$

For convenience, a linear form for φ_(M) may be used:

φ_(M)(m*, . . . )=β^(T) s _(M)(m*, . . . )

-   -   where s_(M)(m*, . . . ) is a statistic and β⊂θ

Effects that evaluate the ego's decision as to the method of communication may be referred to as “mode effects.” Mode effects may include the relay mode selection effect, which quantifies a selected ego i's tendency to contact other actors through a candidate mode m* that was recently used by other actors to contact i, and the habit mode selection effect, which quantifies a selected ego i's tendency to contact other actors through a candidate mode m* that i recently used to contact other actors:

the relay mode selection effect

Ψ_(M)(e,m*,i)={m*=m(e),i∈

(e)}

the habit mode selection effect

Ψ_(M)(e,m*,i)={m*=m(e),i=i(e)}.

The topic density, which describes the process of the ego choosing one of a finite set of topics, given a previous choice of mode, can be represented through a conditional distribution:

f(b(e)|m(e),i(e),τ(e),

_(τ(e)),θ)

As with selection of a mode, the selection of a topic by the ego can be modeled using a multinomial logit, or discrete choice distribution, the selected mode being introduced into the topic distribution as a covariate:

${f\left( {{b(e)} = {b^{*}\text{|}\ldots}}\mspace{11mu} \right)} = \frac{\exp \; {\varphi_{B}\left( {b^{*},\ldots}\mspace{11mu} \right)}}{\sum\limits_{b \in \mathcal{B}_{({\tau {(e)}})}}\; {\exp \; {\varphi_{B}\left( {b,\ldots}\; \right)}}}$

For convenience, a linear form for φ_(B) may be used:

φ_(B)(b, . . . )=γ^(T) s _(B)(b, . . . )

-   -   where s_(B)(b, . . . ) is a statistic and γ⊂θ

Effects that evaluate the ego's decision as to the topic of communication may be referred to as “topic effects.” Topic effects may include the relay topic selection effect, which quantifies a selected ego i's tendency to communicate with other actors about a candidate topic b* that recently appeared in communications sent to i by other actors, and the habit topic selection effect, which quantifies a selected ego i's tendency to communicate with other actors about a candidate topic b* that recently appeared in communications sent by i to other actors:

the relay topic selection effect

Ψ_(B)(e,b*, . . . )={b*=b(e),m=m(e),i=j(e)}

the habit topic selection effect

Ψ_(B)(e,b*, . . . )={b*=b(e),m=m(e),i=i(e)}

The alter density, which describes the process of the ego choosing to communicate with a set of alters, given previous choices of mode and topic, can be represented through a conditional distribution:

f(

(e)|b(e),m(e),i(e),τ(e),

_(τ(e)),θ)

Notably, the size of the outcome space of possible alter sets

(e) can be enormous, even for a small

. The size of this set |

(e)| is stochastic and lies on the interval [1,

_(τ(e))). For each size |

(e)| on this interval, we have:

$\quad\begin{pmatrix}{{_{\tau {(e)}}} - 1} \\{{(e)}}\end{pmatrix}$

As such, the dimensionality of the outcome space is:

$\sum\limits_{a = 1}^{_{\tau {(e)}} - 1}\; {\quad\begin{pmatrix}{{_{\tau {(e)}}} - 1} \\a\end{pmatrix}}$
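To give a sense of scale, the count of possible alter sets can be evaluated directly. The sketch below assumes, as the stated interval suggests, that the sum runs over alter set sizes a = 1, ..., n−1 for n actors (the ego cannot select itself); the function name is hypothetical.

```python
from math import comb

def alter_set_count(n_actors):
    """Number of non-empty alter sets an ego can form from the other n_actors - 1 actors."""
    return sum(comb(n_actors - 1, a) for a in range(1, n_actors))

# The sum of binomial coefficients collapses to 2**(n_actors - 1) - 1,
# so the outcome space roughly doubles with every additional actor.
```

With only 21 actors the ego already faces more than a million candidate alter sets, which is what motivates the alter-by-alter decomposition.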

To avoid practical problems associated with the enormity of the outcome space of possible alter sets, 𝒥(e) may be modeled as an ordered set, such that the ego adds actors into the alter set alter-by-alter:

(e) = {j₁, j₂, …  , j_((e))}${_{*}(a)} = \left\{ \begin{matrix}\varnothing & {a = 1} \\\left\{ {j_{1},j_{2},{\ldots \text{|}},j_{a - 1}} \right\} & {a > 1}\end{matrix} \right.$

Modeled in this way, the density may be decomposed:

${f\left( {{(e)}\text{|}\ldots}\; \right)} = {\prod\limits_{a = 1}^{{{(e)}}}\; {f\left( {{j_{a}\text{|}{_{*}(a)}},\ldots}\; \right)}}$

Each element in the decomposition of the density may be modeled using amultinomial logit, or discrete choice distribution:

${f\left( {{j^{*}\text{|}{_{*}(a)}},\ldots}\; \right)} = \frac{\exp \; {\varphi_{J}\left( {j^{*},{_{*}(a)},\ldots}\; \right)}}{\sum\limits_{j \in {{({{i{(e)}},{_{*}{(a)}}})}}}\; {\exp \; {\varphi_{J}\left( {j,{_{*}(a)},\ldots}\; \right)}}}$

In the modeling of the elements, the alter risk set

(a) defines the choices of alter available to an ego for the alter selection decision at hand. One way of characterizing the alter risk set is:

${\left( {i,{_{*}(a)}} \right)} = \left\{ \begin{matrix}{_{\tau {(e)}}\backslash i} & {{_{*}(a)} = \varnothing} \\{_{\tau {(e)}}\backslash {_{*}(a)}} & {else}\end{matrix} \right.$

-   -   where we denote the choice of j*=i(e) as the choice to not add        any more alters into the set        (e).

Decomposing the model for alter selection in this way makes it possible to specify φ_(J) in a linear form:

φ_(J)(j,

_(*)(a), . . . )=η^(T) s _(J)(j,

_(*)(a), . . . )

-   -   where s_(J)(j,        _(*)(a), . . . ) is a statistic and η⊂θ

Effects that evaluate the ego's decision(s) as to the recipient(s) of a communication may be referred to as “alter effects.” It is possible to develop alter effects that depend on either or both of the relational event history and covariates. Alter effects that depend on the relational event history may include the fellowship alter effect and the repetition alter effect:

the Fellowship Alter Effect

Ψ_(J)(e,j*,

_(*)(a), . . . )={i∪j*∪

_(*)(a)⊂ i(e)∪

(e)}

and the Repetition Alter Effect

Ψ_(J)(e,j*,

_(*)(a), . . . )={i=i(e),j*∈

(e)}

Alter effects that do not depend on the relational event history may include the activity alter effect and the actor covariate alter effect:

the Activity Alter Effect

s _(J)(j*,

_(*)(a), . . . )=1−

{i(e)=j*}

where

{i(e)=j*} is the indicator function,

and the Actor Covariate Alter Effect

s _(J)(j*,

_(*)(a), . . . )=(1−

{i(e)=j*})X{j*,τ(e)}

-   -   where X{j*,τ(e)} is the value of the actor covariate X for actor        j* at time τ(e).

In some situations, it may be advantageous to restrict the model to a set of focal actors

(t), a subset of actors in the relational event history:

ε_(F) ={e∈ε:i(e)∈

(τ(e))}

These situations can arise, for example, when there is a need for analysis of communications involving only a subset of actors in the relational event history, or when the relational event history includes complete data for one subset of actors but not for another. In these situations, instead of drawing inference using the joint probability of the entire event history, the joint probability of all events sent by the focal actors may be modeled. The joint probability density of all events sent by focal actors may be written:

f  ( ℰ F  |  ℰ  \  ℰ F ,  ,  ,  , 0 , θ ) = ∏ e ∈ ℱ  ( τ  ( e) )   f  ( e  |  ℰ τ  ( e ) ,  ,  ,  , 0 , θ )

In this distribution, all events that are initiated by non-focal actors are regarded as exogenous.

Following the population of the probability model 106 based on the relational event history 104 and covariate data 105, a baseline of communications behavior is determined (207). For example, in a situation in which the model described with respect to operation 205 is used, a baseline may be determined by estimating the effect values (statistical parameters). The estimated effect values represent the baseline communications behavior because they indicate the impact of the corresponding effect (statistic) on the event probabilities.

The effect values may be estimated, for instance, through maximum likelihood methods. The Newton-Raphson algorithm, for example, may be used. The Newton-Raphson algorithm is a method of determining a maximum likelihood through iterative optimization of parameter values that, given a current set of estimates, uses the first and second derivatives of the log likelihood function about these estimates to intelligently select the next, more likely set of estimates. Through the Newton-Raphson method, which may be performed, for example, using a distributed computing platform, a converged estimate of each effect value included in the probability model may be obtained.

Departures from the baseline of communications behavior may be determined (209). For instance, the baseline can be considered normal communications behavior, and deviations from the baseline can be considered abnormal, or anomalous behaviors. In models in which the baseline is represented by effect values estimated based on the relational event history, anomalous behaviors may be detected by comparing effect values for subsets (also referred to as pivots) of the relational event history to the baseline values in order to determine whether, when, or where behavior deviates from the norm.

A pivot may, for example, include only events involving a specific actor during a specific time period, only events involving a specific set of topics, or only events involving a specific subset of actors. If, for example, the baseline reveals that actors in the relational event history typically do not communicate between midnight and 6:00 AM, a particular actor initiating 30 events during those hours over a particular period of time might be considered anomalous. As another example, if one subset of actors typically communicates by email, but the majority of communications between actors in the subset during a specific time frame are phone calls, the switch in mode might be considered anomalous.

Effect values may be determined for a pivot using the same methods employed in determining the baseline, for example, through use of the Newton-Raphson method. Pivot-specific effect values may be referred to as “fixed effects.”

Comparison of the fixed effects to the baseline may allow for estimation of the degree to which the communications behavior described by a particular pivot departs from normality. Other, more involved inferences may also be drawn. For example, it is possible to determine whether a fixed effect has high or low values, is increasing or decreasing, or is accelerating or decelerating. Further, it is possible to determine whether or how these features hang together between multiple fixed effects. Analyses of this nature may be performed through a hypothesis-testing framework that, for example, employs a score test to compare an alternative hypothesis for one or more effect values to a null hypothesis (i.e., the baseline).

FIG. 3 is a flowchart of an example of a process for determining a baseline communications behavior for a relational event history. The process 300 may be implemented, for example, using system 100. In such an implementation, one or more parts of the process may be executed by inference engine 107, which may interface with other computers through a network. Inference engine 107 may, for example, send or receive data involved in the process to and from one or more local or remote computers.

Assuming that the relational event history 104 and covariate data 105 contain enough information to compute probabilities of all events in the dataset, estimation of the baseline may be performed using a maximum likelihood method.

In more detail, it is possible to estimate effect values through maximum likelihood methods such as the Newton-Raphson method. Newton-Raphson is a method of iterative optimization of parameter values that, given a current set of estimates, uses the first and second derivatives of the log likelihood function about these estimates to intelligently select the next, more likely set of estimates. When applied in determining a baseline for the probability model 106, the Newton-Raphson method may use the first and second derivatives of the log likelihood function about a set of estimated effect values in order to select the next, more likely set of estimated effect values, with the goal of maximizing the “agreement” of the probability model 106 with the relational event history 104 and covariate data 105 on which the model 106 is based. Convergence of a set of effect values may be considered to have occurred when a summation across all events of the first derivative of the log likelihood function about the currently estimated effect values is approximately zero, and when the summation across all events of the second derivative of the log likelihood function about the currently estimated effect values meets or exceeds a threshold of negativity.

The joint probability of the relational event history ε may be adapted into a log likelihood that can be maximized numerically via Newton-Raphson:

L  ( θ  |  ) = ∑ e ∈ ℰ   log   f  ( e  |  τ  ( e ) , θ )

The method begins with an initial parameter estimate θ̂₀ (301). First and second derivatives of the log likelihood with respect to θ may be derived (303):

${V\left( {\theta } \right)} = \frac{\partial{L\left( {\theta } \right)}}{\partial\theta}$${\mathcal{I}\left( {\theta } \right)} = {- \frac{\partial^{2}{L\left( {\theta } \right)}}{\partial\theta^{2}}}$

The initial parameter estimate θ̂₀ may be refined iteratively through update steps, in which first and second derivatives are evaluated at θ̂_(i) for each event (305) and then summed across all events (307) in order to arrive at a new parameter estimate θ̂_(i+1) (309):

θ̂_(i+1)=θ̂_(i) +V_(θ̂_(i)) ^(T)(θ|

)

_(θ̂_(i)) ⁻¹(θ|

)|

Update steps may continue until optimization converges on a value (311). The converged value is the maximum likelihood estimate θ̂, which has an estimated covariance matrix that is the inverse of the negative second derivative matrix evaluated at θ̂:

$\hat{\Sigma}_{\hat{\theta}} = \mathcal{I}_{\hat{\theta}}^{-1}\left( \theta \right)$

In a system 100 in which inference engine 107 is implemented on a distributed computing platform, the process of determining a baseline may involve distribution of the model 106 to a plurality of computers associated with inference engine 107, each of which is involved in computing the maximum likelihood estimate θ̂ through the Newton-Raphson algorithm. Different computers may, for example, evaluate first and second derivative values for different events, with those evaluated derivatives then being summed. In such an implementation, an individual computer to which the model is distributed may evaluate first and second derivatives for a single event (305) during each update of Newton-Raphson, and may report the evaluations to a central computer that sums the first and second derivatives across all events (307), and that forms a new parameter estimate based on the result of summing the first and second derivatives (309).

As explained above, each of the conditional distributions in the probability model 106 may be formulated as a multinomial logit. For purposes of deriving first and second derivatives that may be used in determining a baseline through the Newton-Raphson method, the following multinomial distribution, featuring a random variable X with an outcome space χ and an arbitrary statistic s(x), may be considered:

${f\left( {X = {x\theta}} \right)} = \frac{\exp \; \theta^{T}{s(x)}}{\sum\limits_{x_{a} \in }{\exp \; \theta^{T}{s\left( x_{a} \right)}}}$

The log likelihood of this distribution may be denoted:

$\begin{matrix}{{g\left( {X = {x\theta}} \right)} = {\log \; {f\left( {X = {x\theta}} \right)}}} \\{= {{\theta^{T}{s(x)}} - {\log {\sum\limits_{x_{a} \in }{\exp \; \theta^{T}{s\left( x_{a} \right)}}}}}}\end{matrix}$

The first derivative of g is:

$\begin{matrix}{{g^{\prime}\left( {X = {x\theta}} \right)} = \frac{\partial{g\left( {X = {x\theta}} \right)}}{\partial\theta}} \\{= {{\frac{\partial}{\partial\theta}\theta^{T}{s(x)}} - {\frac{\partial}{\partial\theta}\log {\sum\limits_{x_{a} \in }{\exp \; \theta^{T}{s\left( x_{a} \right)}}}}}} \\{= {{s(x)} - \frac{\sum\limits_{x_{a} \in }{{s\left( x_{a} \right)}\exp \; \theta^{T}{s\left( x_{a} \right)}}}{\sum\limits_{x_{a} \in }{\exp \; \theta^{T}{s\left( x_{a} \right)}}}}} \\{= {{s(x)} - {E_{\theta}\left\{ {s(X)} \right\}}}}\end{matrix}$

-   -   where E_(θ){s(X)} is the expectation of the statistic s(X) with        respect to the random variable X given the parameter θ.

Note that the expectation of a random variable y with domain Y and probability density p(y) is calculated by:

$\sum\limits_{y \in Y} y \, p(y)$

In deriving the first derivative of g, note that y=s(x_(a)) and p(y) is the multinomial distribution f(X=x_(a)|θ), which allows for the final simplification.

The second derivative of g is:

$\begin{aligned} g^{\prime\prime}(X = x \mid \theta) &= \frac{\partial^{2} g(X = x \mid \theta)}{\partial\theta \, \partial\theta^{T}} \\ &= \frac{\partial}{\partial\theta} \left( s(x) - E_{\theta}\left\{ s(X) \right\} \right)^{T} \\ &= - \frac{\partial}{\partial\theta} E_{\theta}\left\{ s(X) \right\}^{T} \\ &= - \sum\limits_{x_{a} \in \chi} \frac{\partial}{\partial\theta} f(X = x_{a} \mid \theta) \, s(x_{a})^{T} \end{aligned}$

Note the relationship between f and the first derivative of g:

$\begin{aligned} g^{\prime}(X = x \mid \theta) &= \frac{\partial \log f(X = x \mid \theta)}{\partial\theta} \\ &= \frac{1}{f(X = x \mid \theta)} \frac{\partial f(X = x \mid \theta)}{\partial\theta} \end{aligned}$

So that

$\frac{\partial}{\partial\theta} f(X = x \mid \theta) = f(X = x \mid \theta) \, g^{\prime}(X = x \mid \theta)$

Incorporating this relationship allows for further derivation:

$\begin{aligned} g^{\prime\prime}(X = x \mid \theta) &= - \sum\limits_{x_{a} \in \chi} f(X = x_{a} \mid \theta) \, g^{\prime}(X = x_{a} \mid \theta) \, s(x_{a})^{T} \\ &= - E_{\theta}\left\{ g^{\prime}(X \mid \theta) \, s(X)^{T} \right\} \\ &= - E_{\theta}\left\{ \left( s(X) - E_{\theta}\left\{ s(X) \right\} \right) s(X)^{T} \right\} \\ &= - E_{\theta}\left\{ s(X) s(X)^{T} \right\} + E_{\theta}\left\{ \left[ E_{\theta}\left\{ s(X) \right\} \right] s(X)^{T} \right\} \\ &= - E_{\theta}\left\{ s(X) s(X)^{T} \right\} + E_{\theta}\left\{ s(X) \right\} E_{\theta}\left\{ s(X) \right\}^{T} \\ &= - {Cov}_{\theta}\left\{ s(X) \right\} \end{aligned}$

where Cov_(θ){s(X)} is the covariance matrix of s(X) given θ

It can be observed that:

$\begin{aligned} {Cov}_{\theta}\left\{ g^{\prime}(X \mid \theta) \right\} &= {Cov}_{\theta}\left\{ s(X) - E_{\theta}\left\{ s(X) \right\} \right\} \\ &= E_{\theta}\left\{ \left( s(X) - E_{\theta}\left\{ s(X) \right\} \right) \left( s(X) - E_{\theta}\left\{ s(X) \right\} \right)^{T} \right\} \\ &= E_{\theta}\left\{ s(X) s(X)^{T} - s(X) E_{\theta}\left\{ s(X) \right\}^{T} - E_{\theta}\left\{ s(X) \right\} s(X)^{T} + E_{\theta}\left\{ s(X) \right\} E_{\theta}\left\{ s(X) \right\}^{T} \right\} \\ &= E_{\theta}\left\{ s(X) s(X)^{T} \right\} - E_{\theta}\left\{ s(X) \right\} E_{\theta}\left\{ s(X) \right\}^{T} \\ &= {Cov}_{\theta}\left\{ s(X) \right\} \end{aligned}$

Giving:

$\begin{aligned} g^{\prime\prime}(X = x \mid \theta) &= - {Cov}_{\theta}\left\{ s(X) \right\} \\ &= - {Cov}_{\theta}\left\{ g^{\prime}(X \mid \theta) \right\} \end{aligned}$

Because each of the ego, mode, topic, and alter distributions is a multinomial logit, these results can be applied when determining the baseline using the Newton-Raphson method.
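The two results, a first derivative of s(x) − E_θ{s(X)} and a second derivative of −Cov_θ{s(X)}, are enough to sketch a Newton-Raphson estimator for a single multinomial logit. The following is a minimal sketch, not the implementation described in the specification: the function names, the representation of each decision as a `(S, chosen)` pair, the zero starting value, and the convergence tolerance are all assumptions.

```python
import numpy as np

def choice_probabilities(theta, S):
    """Multinomial logit probabilities; S has one row of statistics per option."""
    scores = S @ theta
    scores = scores - scores.max()   # numerical stability
    w = np.exp(scores)
    return w / w.sum()

def score_and_information(theta, choice_sets):
    """Sum first derivatives (V) and negative second derivatives (I) over events."""
    k = theta.shape[0]
    V, I = np.zeros(k), np.zeros((k, k))
    for S, chosen in choice_sets:
        p = choice_probabilities(theta, S)
        mean = p @ S                                 # E_theta{s(X)}
        V += S[chosen] - mean                        # s(x) - E_theta{s(X)}
        centered = S - mean
        I += centered.T @ (p[:, None] * centered)    # Cov_theta{s(X)}
    return V, I

def newton_raphson(choice_sets, k, tol=1e-8, max_iter=50):
    """Return the converged estimate and its estimated covariance matrix."""
    theta = np.zeros(k)                              # hypothetical initial estimate
    for _ in range(max_iter):
        V, I = score_and_information(theta, choice_sets)
        if np.linalg.norm(V) < tol:                  # summed first derivative near zero
            break
        theta = theta + np.linalg.solve(I, V)        # update step
    _, I = score_and_information(theta, choice_sets)
    return theta, np.linalg.inv(I)
```

The per-event contributions to `V` and `I` are independent of one another, so the loop in `score_and_information` is the part that could be spread across several machines and summed centrally.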

FIG. 4 is a flowchart of another example of a process 400 for determining a baseline communications behavior for a relational event history. The process 400 may be implemented, for example, using system 100. In such an implementation, one or more parts of the process may be executed by inference engine 107, which may interface with other computers through a network. Inference engine 107 may, for example, send or receive data involved in the process to and from one or more local or remote computers.

The process 400 may be used when the order in which recipients are selected by the sender is unknown. As indicated above, in the situation in which the sender of a communication may select multiple recipients for the communication, the sender's choice of which recipients to include may be best modeled as a set of sequential decisions involving an order in which recipients are added by the sender to the communication, rather than a single decision. A single email, for example, may be addressed to multiple recipients, and the sender of the email may add each recipient in sequence.

As a result, a model, such as the one described with respect to operation 205 above, may model the choice of recipients as a sequential set of decisions. In this case, the model may be used without issue when the order of the selection of the recipients is known. However, in some cases, the recipients may be known, but not the order, which may present a missing data problem if the model is to be employed.

More formally, when attempting to determine a baseline, it may be the case that, for each event e, an unordered set of chosen alters

(e) is known, but

(e), denoting an ordered set of alters representing the sequence in which alters were chosen, is not known. In such a case, ℑ(e), denoting the set of all possible ordered sets

(e) satisfying the requirement that the sequence of alter selections in

(e) could have yielded the observed

(e) is also not known, which presents a missing data problem.

To estimate θ̂ in such a situation, the expectation maximization (EM) algorithm may be employed. The algorithm consists of two steps that may be executed iteratively until θ̂ converges. Performing EM using probability model 106 involves calculation of expectations of quantities that depend on the missing sequences of alter choices. In general, evaluation of the expectation of a function h(

(e);

_(τ(e)),θ) may be performed using:

E θ  { h  (   ( e ) , τ  ( e ) , θ ) } = ∑   ( e ) ∈   ( e ) f  (   ( e )    ( e ) , τ  ( e ) , θ )  h  (   ( e ) , τ  (e ) ,  θ )

where the density f(

(e)|

(e),

_(τ(e)),θ) is the probability of the alter selection sequence

(e) given that the set

(e) is ultimately chosen.

During a maximization step, a single update of the Newton-Raphson algorithm is conducted by summing the expected first and second derivatives across all events:

θ̂_(i+1)=θ̂_(i) +E_(θ̂_(i)) {V^(T)(θ|

)}[E_(θ̂_(i)) {

(θ|

)}]⁻¹

New parameter estimates can be formed based on the result of summing the first and second derivatives, and the EM process may be repeated until convergence of θ̂_(i) is achieved.

The most practically difficult part of the EM process is computing expectations of the form:

E θ  { h  (   ( e ) , τ  ( e ) , θ ) } = ∑   ( e ) ∈   ( e ) f  (   ( e )    ( e ) , τ  ( e ) , θ )  h  (   ( e ) , τ  (e ) ,  θ )

Two approaches may be taken, depending on the size of ℑ(e), which is given by:

|ℑ(e)|=|

(e)|!

When the number of actors in the unordered set of alters is below a threshold number (e.g., 5 or fewer), analytical evaluation may be feasible and therefore performed. Noting that:

f  (   ( e )    ( e ) , τ  ( e ) , θ ) = f  (   ( e ) ,   (e )  τ  ( e ) , θ ) f  (   ( e )  τ  ( e ) , θ ) and(e) ⇒ (e) f  (   ( e ) ,   ( e )  τ  ( e ) , θ ) = f  (  ( e )  τ  ( e ) , θ ) Since f  (   ( e )  τ  ( e ) , θ ) = ∑  ( e ) ∈   ( e )  f  (   ( e )  τ  ( e ) , θ )

It is possible to directly evaluate

f  (   ( e )    ( e ) , τ , ( e ) , θ ) = f  (   ( e )  τ  (e ) , θ ) f  (   ( e )  τ  ( e ) , θ )

However, when the number of actors in the unordered set of alters is above a threshold number, analytical evaluation may be practically infeasible. Instead, numerical techniques, such as Markov chain Monte Carlo (MCMC) simulation, may be used to approximate the expectation.
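For small alter sets, the analytical route amounts to enumerating every ordering, evaluating its probability, and normalizing by the sum. The sketch below illustrates this under simplifying assumptions that are not part of the specification: each candidate alter has a fixed score (a hypothetical stand-in for φ_J), each alter is drawn from the not-yet-chosen candidates by a multinomial logit, and the decision to stop adding alters is not modeled.

```python
import math
from itertools import permutations

def sequence_probability(sequence, scores):
    """Probability of adding alters in this order under the simplified model."""
    remaining = dict(scores)              # candidates not yet chosen
    prob = 1.0
    for j in sequence:
        total = sum(math.exp(v) for v in remaining.values())
        prob *= math.exp(remaining.pop(j)) / total
    return prob

def ordering_posterior(alter_set, scores):
    """Probability of each ordering given that this unordered set was chosen."""
    joint = {seq: sequence_probability(seq, scores)
             for seq in permutations(alter_set)}
    norm = sum(joint.values())            # probability of the unordered set
    return {seq: p / norm for seq, p in joint.items()}

def expectation(h, alter_set, scores):
    """Exact expectation of h over all orderings of the observed alter set."""
    return sum(p * h(seq)
               for seq, p in ordering_posterior(alter_set, scores).items())
```

The enumeration visits every permutation of the alter set, so its cost grows factorially with the set size, which is why a size threshold separates this route from simulation.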

To conduct maximum likelihood estimation with EM, an initial parameter estimate θ̂₀ is made (401), and first and second derivatives are derived (403). Prior to performing the expectation step, a determination may be made as to whether the number of possible ordered alter sets, |ℑ(e)|, is above a threshold (404). If |ℑ(e)| is below the threshold, EM may proceed with the expectation being evaluated analytically (404). If, on the other hand, |ℑ(e)| is above the threshold, MCMC is used to determine the expectation through simulation (406).

In the first case, in which |ℑ(e)| is below the threshold, expected first and second derivatives (E_(θ̂₀) {V(θ|

)} and E_(θ̂₀) {

(θ)|

}, respectively) are evaluated for all events (405). A single update of the Newton-Raphson algorithm is conducted by summing the expected first and second derivatives across all events (407), and new parameter estimates are formed based on the result of summing the first and second derivatives (409). The EM process may then be repeated until convergence of θ̂_(k) is achieved (411).

In the second case, in which |ℑ(e)| is above the threshold, the expected first and second derivatives are determined for each event through simulation. For example, a numerical technique such as MCMC may be employed to conduct the simulation (406).

A Markov chain is a stochastic process involving transitions between states, where the conditional probability distribution of future states of the process depends only on the present state, and not on the preceding sequence of states. For example, given a process with a set of two possible states i and j, the process is a Markov chain if the probability that the process will transition from a present state i to a next state j depends only on the present state i, and not on the states that preceded the present state. Many Markov chains feature what is called an equilibrium distribution, such that, from the present state, the probability that, after a sufficient number of transitions, the process will be in a specific state no longer changes from one transition to the next. If, for example, the probability of the process being in state i after n transitions is p, and the probability of the process being in state i after any number of transitions after n is also p, then the Markov chain can be said to have an equilibrium distribution.

Monte Carlo methods are computational algorithms that randomly sample from a process, and Markov chain Monte Carlo methods involve sampling from a Markov chain constructed such that the target distribution is the Markov chain's equilibrium distribution. After a sufficient number of transitions, draws taken from the Markov chain will appear as if they resulted from the target distribution. By taking the sample average of draws, values for quantities of interest can be approximated.

MCMC may be used to approximate expectations of quantities that depend on a missing sequence of alter choices. In more detail, MCMC may be used to sample draws of possible ordered sets of alters,

(e), from the density f(

(e)|

(e),

_(τ(e)),θ), which is the probability of the alter selection sequence

(e) given that the set of unordered alters

(e) is chosen. Draws may be sampled in the following way:

-   -   Let        ^((i))(e) be an arbitrary sequence containing the elements of        (e). For each desired (i-th) sample        ^((i))(e), execute the following steps:    -   1. Draw a new sequence        *(e)∈ℑ(e) at uniform random.        *(e) is called the proposal.    -   2. Evaluate the acceptance probability

π i = min  { 1 , f  (  *  ( e ) ,   ( e )  τ  ( e ) , θ ) f  ( ( i )  ( e ) ,   ( e )  τ  ( e ) , θ ) }

-   -   3. With probability π_(i), set        ^((i))=        *, otherwise set        ^((i)=)        ^((i-1)).

Because the sample sets

^((i)) are approximately distributed f(

(e)|

(e),

_(τ(e)),θ), it is possible to calculate the expectation, using a sample size N, by the ergodic average:

E θ  { h  (   ( e ) , τ  ( e ) , θ ) } ≈ ∑ i = 1 N  h  (  ( i ) ( e ) , τ  ( e ) , θ )

Having calculated the expectation, expected first and second derivatives can be determined for all events (406). A single update of the Newton-Raphson algorithm is conducted by summing the expected first and second derivatives across all events (408), and new parameter estimates are formed based on the result of summing the first and second derivatives (410). The EM process may then be repeated until convergence of $\hat{\theta}_t$ is achieved (412).
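The sampling steps and the ergodic average described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the described implementation: `sequence_density` is a hypothetical stand-in for the density f (here a simple sequential-choice model driven by assumed per-alter weights), and the alter names, weights, seed, and sample size are invented for the example. The acceptance step uses the standard Metropolis form, the minimum of 1 and the density ratio.

```python
import random

def sequence_density(seq, weight):
    """Hypothetical stand-in for f: probability of adding alters in this order."""
    remaining = list(seq)
    p = 1.0
    for j in seq:
        p *= weight[j] / sum(weight[k] for k in remaining)
        remaining.remove(j)
    return p

def sample_orderings(alters, weight, n_samples, rng):
    """Draw ordered alter sequences using uniform proposals and Metropolis acceptance."""
    current = list(alters)                 # arbitrary starting sequence
    draws = []
    for _ in range(n_samples):
        proposal = list(alters)
        rng.shuffle(proposal)              # proposal drawn at uniform random
        ratio = sequence_density(proposal, weight) / sequence_density(current, weight)
        if rng.random() < min(1.0, ratio): # accept with probability min(1, ratio)
            current = proposal
        draws.append(tuple(current))       # otherwise keep the previous sequence
    return draws

def ergodic_average(draws, h):
    """Approximate an expectation by the sample average of h over the draws."""
    return sum(h(d) for d in draws) / len(draws)

rng = random.Random(0)
weight = {"a": 3.0, "b": 1.0, "c": 1.0}    # assumed weights, for illustration only
draws = sample_orderings(["a", "b", "c"], weight, 20000, rng)

# Quantity of interest: the probability that alter "a" was added first.
estimate = ergodic_average(draws, lambda seq: 1.0 if seq[0] == "a" else 0.0)
```

Under the assumed weights, the exact probability that "a" is added first is 3/5, so `estimate` should land close to 0.6.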

FIG. 5 is a flowchart of an example of a process 500 for modeling a set of sequential decisions made by a sender of a communication in determining recipients of the communication. The process 500 may be implemented, for example, using system 100. In such an implementation, one or more parts of the process may be executed by inference engine 107, which may interface with other computers through a network. Inference engine 107 may, for example, send or receive data involved in the process to and from one or more local or remote computers.

As explained above, in some situations, the alter risk set introduced earlier, which defines the choices of alter available to an ego for the alter selection decision at hand, can become extremely large, leading to computational complexity. As such, in a situation in which a sender of a communication (i.e. ego) may select multiple recipients (i.e. alters) for the communication, the sender's choice of which recipients to include may be best modeled as a set of sequential decisions involving an order in which recipients are added by the sender to the communication, rather than as a single decision. The model may be further refined by considering the sender's choice of which recipients to include as involving a two-stage decision process, whereby the sender first considers a social context in which to communicate, and then chooses recipients of the communication from a set of actors belonging to the social context. For example, in a situation in which potential recipients of the sender's communication appear in multiple social contexts or settings, the set of sequential decisions described above may include, for at least some of the events, one or more social context selection decisions, and one or more additional decisions as to which recipients in a selected social context(s) to include.

In more detail, a sender of a communication may interact with other actors in a variety of contexts, and both the sender and the other actors with whom the sender interacts may be categorized as belonging to specific social contexts or settings. A primary setting, for example, may include actors who belong to an organization or who engage in related activities, while another setting may include actors who have interacted, or who have the potential to interact, at a specific time and/or place.

A sender of a communication may, for example, choose to communicate with actors in a primary setting or in a meeting setting, with one set of potential recipients (e.g., a set including the sender's coworkers and clients) corresponding to the primary setting, and another set of potential recipients (e.g., a set including meeting attendees) corresponding to the meeting setting. In some instances, social contexts or settings may be mutually exclusive, with each potential recipient being modeled as belonging to only one context or setting. In other instances, actors categorized as belonging to one social context or setting may be categorized as belonging to others as well (e.g., a potential recipient may be both a coworker of the sender and a meeting attendee, in which case the potential recipient could be categorized as belonging to both the primary and meeting settings).

As such, for each communication event, a sender of the communication corresponding to the event may select a social context in which to communicate, prior to determining which potential recipients to communicate with, and the model may be formulated such that the calculation of probabilities associated with the sender's decisions takes this selection into account (501).

In more detail, a primary alter risk set $\mathcal{R}_P$ consisting of potential recipients belonging to a primary social context, and a meeting alter risk set $\mathcal{R}_M$ consisting of potential recipients belonging to a meeting social context, may be defined differently, and the sender of the communication may select one of the primary or meeting social contexts prior to sending a communication. If, for example, the primary social context involves actors with whom the sender has recently been in contact, the primary alter risk set $\mathcal{R}_P$ may be defined using a sliding window λ.

More formally, the set of actors with whom an ego i has been involved in at least one event occurring on the interval [t₀,t] may be written:

$\mathcal{N}(i,t,t_0) = \left\{ j \in \mathcal{A}_t : \left| \left\{ e \in \mathcal{E}_t : \{i,j\} \subset i(e) \cup \mathcal{J}(e),\ \tau(e) \geq t_0 \right\} \right| > 0 \right\}$

In this case, the primary alter risk set $\mathcal{R}_P$ can be defined, given some sliding window λ, as:

$\mathcal{R}_P(i, \mathcal{J}_*(a), t) = \begin{cases} \mathcal{N}(i,t,t-\lambda) & \mathcal{J}_*(a) = \varnothing \\ i \cup \mathcal{N}(i,t,t-\lambda) \setminus \mathcal{J}_*(a) & \text{else} \end{cases}$

Given this definition of the primary alter risk set $\mathcal{R}_P$, the meeting alter risk set $\mathcal{R}_M$ may be defined such that the primary and meeting settings are mutually exclusive and collectively exhaustive:

$\mathcal{R}_M(i, \mathcal{J}_*(a), t) = \mathcal{A}_t \setminus \mathcal{R}_P(i, \mathcal{J}_*(a), t)$

Other definitions of the primary alter risk set $\mathcal{R}_P$ and the meeting alter risk set $\mathcal{R}_M$ are, of course, possible.
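The sliding-window definition of the primary alter risk set, and the complementary meeting alter risk set, can be sketched as follows. This is an illustrative reading only: events are represented as hypothetical `(sender, recipients, time)` tuples, the function names are invented, and the second case of the primary risk set is interpreted as ego together with the windowed contacts that are not already in the current alter set.

```python
def contacts(i, events, t, t0):
    """Actors sharing at least one event with ego i on the interval [t0, t]."""
    out = set()
    for sender, recipients, tau in events:
        participants = {sender} | set(recipients)
        if t0 <= tau <= t and i in participants:
            out |= participants
    return out

def primary_risk_set(i, current_alters, events, t, lam):
    """Primary alter risk set for ego i, using a sliding window of length lam."""
    window = contacts(i, events, t, t - lam)
    if not current_alters:
        return window
    # Assumed reading of the second case: ego plus not-yet-chosen windowed contacts.
    return {i} | (window - set(current_alters))

def meeting_risk_set(i, current_alters, events, actors, t, lam):
    """Meeting alter risk set: every actor outside the primary alter risk set."""
    return set(actors) - primary_risk_set(i, current_alters, events, t, lam)

# Hypothetical data: two events, four actors, and a window of 5 time units.
actors = {"ann", "bob", "cat", "dan"}
events = [("ann", {"bob"}, 1.0), ("cat", {"ann", "dan"}, 8.0)]

primary_empty = primary_risk_set("ann", set(), events, 10.0, 5.0)
meeting_empty = meeting_risk_set("ann", set(), events, actors, 10.0, 5.0)
primary_after_cat = primary_risk_set("ann", {"cat"}, events, 10.0, 5.0)
```

In this example the event at time 1.0 falls outside the window, so "bob" is left to the meeting setting while "cat" and "dan" fall in the primary setting.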

Based on the selection of the social context, the sender may make one or more additional decisions as to which recipients within the selected social context to communicate with. After selecting a primary social context, for example, the sender may determine a set of recipients for the communication by selecting one or more potential recipients from the primary alter risk set $\mathcal{R}_P$, using a primary evaluation function (503). Alternately, after selecting another social context, the sender may determine a set of recipients for the communication by selecting one or more potential recipients from within the other social context, using another evaluation function (505).

The evaluation functions used to calculate probabilities associated with the sender's decisions as to recipients may differ depending on the selected social context, the differences reflecting the different ways in which the actor scrutinizes potential recipients in each setting. In the primary setting, for example, alter selection may occur in the same way described above with reference to FIG. 2, while a different evaluation function may be used for alter selection in the meeting setting. Alter selection in the meeting setting may involve, for example, choosing at uniform random a potential recipient who is in the meeting setting but who is not in the primary setting or the current alter set, and then evaluating whether to include the chosen potential recipient in the current alter set using an evaluation function that is specific to the meeting setting.

More formally, the probability density function for this example may be written:

$f(\mathcal{J}(e) \mid \tau(e), \ldots) = \prod_{j^* \in \mathcal{J}(e)} \begin{cases} \kappa \cdot f_P(j^* \mid \mathcal{J}_*(a), \tau(e), \ldots) & j^* \in \mathcal{R}_P(i, \mathcal{J}_*(a), \tau(e)) \\ (1 - \kappa) \cdot f_M(j^* \mid \mathcal{J}_*(a), \tau(e), \ldots) & j^* \in \mathcal{R}_M(i, \mathcal{J}_*(a), \tau(e)) \end{cases}$

where:

-   κ is the probability that ego i chooses to communicate within her primary setting,
-   $f_P$ is the probability density of adding j* to the alter set (given j* is in ego's primary setting),
-   $\mathcal{R}_P$ is the primary alter risk set corresponding to ego's primary setting,
-   $f_M$ is the probability density of adding j* to the alter set (given j* is in ego's meeting setting),
-   and $\mathcal{R}_M$ is the meeting alter risk set corresponding to ego's meeting setting.

The primary alter selection distribution $f_P$ included in the probability density function may differ in only minor ways from the alter selection distribution presented above. Specifically, the primary alter evaluation function $\varphi_{JP}$ may be used, and the alters available for selection may be limited to those belonging to the primary alter risk set $\mathcal{R}_P$, such that the primary alter selection distribution $f_P$ may be written:

$f_P(j^* \mid \ldots) = \frac{\exp \varphi_{JP}(j^*, \mathcal{J}_*(a), \ldots)}{\sum_{j \in \mathcal{R}_P(i(e), \mathcal{J}_*(a), \tau(e))} \exp \varphi_{JP}(j, \mathcal{J}_*(a), \ldots)}$

The meeting alter selection distribution $f_M$ may differ from the primary alter selection distribution $f_P$. The meeting alter selection distribution $f_M$ may, for example, include two conditionally independent probabilities: a first probability that a member j* of the meeting setting is selected for evaluation, and a second probability that, conditioned on j* being selected for evaluation, j* is added into the alter set.

A probability that a member j* of the meeting setting is selected for evaluation for inclusion in the alter set, where selection between members of the meeting setting is at uniform random, may take the form:

$\left| \mathcal{R}_M(i(e), \mathcal{J}_*(a), \tau(e)) \right|^{-1}$

The probability that a selected member j* is added into the alter set may take the form of a simple logit, where there are only two alternatives (to add j* to the alter set, or not to add j* to the alter set):

$\frac{\exp \varphi_{JM}(j^*, \ldots)}{1 + \exp \varphi_{JM}(j^*, \ldots)}$

As such, the meeting alter selection distribution $f_M$ may be written:

$f_M(j^* \mid \ldots) = \left| \mathcal{R}_M(i(e), \mathcal{J}_*(a), \tau(e)) \right|^{-1} \cdot \frac{\exp \varphi_{JM}(j^*, \ldots)}{1 + \exp \varphi_{JM}(j^*, \ldots)}$

Employing, dependent on social context, different evaluation functions to calculate probabilities associated with a sender's decisions as to recipients of a communication, as described above, enables the social context(s) in which communications occur to be taken into account when modeling communications behavior.
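The two context-specific selection distributions can be illustrated with a short sketch. The evaluation functions are reduced here to hypothetical score lookups (`phi_p`, `phi_m`) that depend only on the candidate alter, and the risk sets, scores, and value of κ are assumed for the example; the described evaluation functions take further arguments that are omitted.

```python
import math

def f_primary(j_star, risk_set, phi):
    """Multinomial-logit probability of choosing j_star from the primary risk set."""
    denom = sum(math.exp(phi(j)) for j in risk_set)
    return math.exp(phi(j_star)) / denom

def f_meeting(j_star, risk_set, phi):
    """Uniform pick from the meeting risk set, then a binary logit to add j_star."""
    pick = 1.0 / len(risk_set)
    add = math.exp(phi(j_star)) / (1.0 + math.exp(phi(j_star)))
    return pick * add

def alter_term(j_star, primary_set, meeting_set, kappa, phi_p, phi_m):
    """One factor of the product: kappa * f_P or (1 - kappa) * f_M."""
    if j_star in primary_set:
        return kappa * f_primary(j_star, primary_set, phi_p)
    return (1.0 - kappa) * f_meeting(j_star, meeting_set, phi_m)

# Hypothetical scores standing in for the evaluation functions.
scores_p = {"bob": 0.0, "cat": math.log(3.0)}
scores_m = {"dan": 0.0, "eve": 0.0}
phi_p = lambda j: scores_p[j]
phi_m = lambda j: scores_m[j]

primary_set, meeting_set, kappa = {"bob", "cat"}, {"dan", "eve"}, 0.8
term_cat = alter_term("cat", primary_set, meeting_set, kappa, phi_p, phi_m)
term_dan = alter_term("dan", primary_set, meeting_set, kappa, phi_p, phi_m)
```

With these assumed values, "cat" contributes 0.8 × 0.75 and "dan" contributes 0.2 × 0.5 × 0.5 to the product.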

FIG. 6 is a flowchart of an example of a process 600 for determining a statistic using a decay function. The process 600 may be implemented, for example, using system 100. In such an implementation, one or more parts of the process may be executed by inference engine 107, which may interface with other computers through a network. Inference engine 107 may, for example, send or receive data involved in the process to and from one or more local or remote computers.

The likelihood of an event's occurrence may depend on other events that have occurred in the past. The likelihood of a child calling her mother, for example, may increase if the mother recently attempted to call the child. The impact of past events on the likelihood of a future event may vary, however, depending on the relative distance of the events in terms of time. For example, a man may be more likely to respond to a voicemail left one day ago, than to a voicemail left one year ago.

When modeling social behavior, the impact of past relational events on the probability of a future relational event may be taken into account, and the reliability of the probability model may be enhanced by taking into account only those relational events that occurred within a certain time frame, and/or by discounting the impact of past relational events in proportion to their distance from the future relational event in terms of time. The concept of time may be considered in terms of, for example, clock time and/or social time. For example, when clock time is used, the amount of time since a previous event may be used to weight the relevance of the prior event. As another example, when social time is used, a number of relational events occurring between a past relational event and a future relational event may be used, for example, to weight the relevance of the past relational event.

In more detail, a decay function may be used within a model, or within a statistic included in a model, to incorporate the idea that relational events that occurred far in the past are less relevant for purposes of prediction than relational events that occurred recently.

As noted above, the incorporation of the passage of time into the impact of an event on future events may be modeled through a decay function using the following general form:

$s(t, \ldots) = \sum_{\{ e \in \mathcal{E} : \Psi_I(e, \ldots) \}} \delta(t, \tau(e))$

-   -   where $\Psi_I(e, \ldots)$ is called a predicate, and describes criteria by which a subset of events is selected.

In words, a statistic s may be determined based on a selected subset of relational events, where the subset of relational events is selected according to a predicate $\Psi_I$ (601). Each selected event may then be weighted using a decay function δ (603), and the statistic s may be determined based on the weighted events by, for example, summing the weighted events (605).

The predicate Ψ describing criteria by which the subset of events is selected may vary depending on the statistic s. The activity ego selection effect, for example, is a statistic that is defined through its predicate, where the predicate indicates that only those relational events e that have been sent by the ego i will be selected for inclusion in the subset. The predicate of another statistic, the popularity ego selection effect, instead specifies that only those relational events that have been sent to the ego i will be selected for inclusion in the subset.

the activity ego selection effect

$\Psi_I(e, i^*) = \{ i^* = i(e) \}$

the popularity ego selection effect

$\Psi_I(e, i^*) = \{ i^* \in \mathcal{J}(e) \}$

The form of the decay function δ used to determine a statistic s determines the manner in which the relational events contained within the selected subset of relational events are weighted, where the weight of a relational event indicates the impact of the relational event on the probability of a future relational event's occurrence. An exponential decay function may be used, for example, to model the relevance of a first relational event for purposes of predicting the occurrence of a second relational event when the relevance decreases with the time that has passed since the first relational event's occurrence. Given some increasing value Δ, the value of an exponentially decaying quantity δ may be determined, for example, using the following exponential decay function:

$\delta(\Delta, \rho) = \exp\{ -\rho\Delta \}$

According to this function, the value of the quantity δ monotonically decreases in Δ when ρ>0. In nature, for example, values that decay in this way include masses of radioactive materials. The rate of decay is linear in the amount of quantity δ remaining:

$\frac{\partial \delta(\Delta, \rho)}{\partial \Delta} = -\rho\,\delta(\Delta, \rho)$

In other words, the more positive the decay parameter ρ is, the more quickly the quantity δ decreases in Δ.

When modeling the relevance of a relational event to the occurrence of a future relational event, Δ can be considered to be an abstract representation of the passage of time:

-   -   For Δ∈[0,∞) and ρ>0, δ(Δ,ρ)∈(0, 1]

Modeled in this way, when no time elapses, i.e. Δ=0, the relevance of an event is 1, and as Δ→∞, the relevance of an event approaches 0. When ρ=0, the relevance of an event is always 1, regardless of the amount of time Δ that has elapsed, and as ρ increases, the relevance of the event approaches 0 more quickly as time passes.
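The properties of the exponential decay function noted above can be checked numerically with a few lines of Python. The function name `decay` and the sample values of Δ and ρ are chosen for illustration only.

```python
import math

def decay(delta, rho):
    """Exponential decay: relevance after an abstract amount of elapsed time delta."""
    return math.exp(-rho * delta)

rho = 0.5
weights = [decay(d, rho) for d in (0.0, 1.0, 2.0, 10.0)]

# Finite-difference check that the rate of decay equals -rho * decay(delta, rho).
h = 1e-6
slope = (decay(2.0 + h, rho) - decay(2.0 - h, rho)) / (2 * h)
```

The first weight is exactly 1 (no time has elapsed), the weights shrink as Δ grows, and setting ρ to zero keeps the relevance at 1 for any Δ.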

Δ can be used in different ways to model the passage of time from one event occurring at time t₀ to another event occurring at time t₁>t₀. The simplest way to operationalize the passage of time through Δ is to use clock time, i.e. t₁−t₀, which yields a time decay function:

$\delta(t_1 - t_0, \rho)$.

A disadvantage of using clock time in modeling the relevance of relational events, however, is that relevance will decay at the same rate without regard for social context. The relevance of a relational event will decay, for example, at the same rate in the middle of the night as it will during the middle of the workday. The concept of social time that progresses only when communications occur, rather than based on the passage of clock time, can be incorporated into the model by focusing on events occurring after t₀ and at or before t₁, i.e., events in the set:

$\mathcal{E}_{(t_0,t_1]} = \{ e \in \mathcal{E} : \tau(e) > t_0,\ \tau(e) \leq t_1 \}$

The cardinality $|\mathcal{E}_{(t_0,t_1]}|$, i.e., the number of events occurring in $\mathcal{E}_{(t_0,t_1]}$, may be used to operationalize the passage of social time, yielding an event-wise decay function:

$\delta(|\mathcal{E}_{(t_0,t_1]}|, \rho)$.

When used to determine a statistic s based on a selected subset of relational events, the event-wise decay function renders the amount of clock time occurring between selected events irrelevant: instead of assigning weights to the selected events based on the passage of clock time, the event-wise decay function assigns weights to events based on the order in which they occurred. Specifically, for ρ>0, the weights assigned by the event-wise decay function to a series of selected events vary such that each selected event has a lower weight than the event occurring after it in time, and a higher weight than the event occurring before it in time, because more events have elapsed since an earlier event than since a later one. Thus, when the event-wise decay function is used, it is the occurrence of events, rather than the passage of time, that matters for purposes of determining the relevance of a relational event.

The criteria for determining relevance may also be restricted to focus on the social time of a particular actor, i. A participating event-wise decay function, for example, may be formulated:

$\delta(\#_p(i, t_0, t_1), \rho)$

where

$\#_p(i, t_0, t_1) = \left| \{ e \in \mathcal{E}_{(t_0,t_1]} : i \in i(e) \cup \mathcal{J}(e) \} \right|$

Like the event-wise decay function, the participating event-wise decay function renders the amount of clock time occurring between selected events irrelevant, because the weights assigned to events by the participating event-wise decay function vary based on the order in which events occurred. However, when the participating event-wise decay function is used, the weight assigned to a first selected event will match the weight assigned to a second selected event occurring after the first selected event in time, if the particular actor i did not participate in the second selected event. If, on the other hand, the actor i did participate in the second selected event (as, e.g., a sender or a receiver), the first selected event will have a lower weight than the second selected event, because one more event involving the actor i has elapsed since the first selected event. As such, when the participating event-wise decay function is used, it is the occurrence of events involving the actor i, rather than the passage of time, that matters for purposes of determining the relevance of a relational event.

The criteria for determining relevance may be further restricted so as to require, for example, that, for social time to elapse, the actor i must have been the sender of an event. A sending event-wise decay function may be formulated:

$\delta(\#_s(i, t_0, t_1), \rho)$

where

$\#_s(i, t_0, t_1) = \left| \{ e \in \mathcal{E}_{(t_0,t_1]} : i = i(e) \} \right|$

When the sending event-wise decay function is used, the weights assigned to events vary based on the order in which events occurred. However, a weight assigned to a first selected event by the sending event-wise decay function will match the weight assigned to a second selected event occurring after the first selected event in time, if the actor i did not send the second selected event. Alternatively, if the actor i did send the second selected event, the first selected event will have a lower weight than the second selected event. Thus, when the sending event-wise decay function is used, it is the occurrence of events sent by the actor i, rather than the passage of time, that matters for purposes of determining the relevance of a relational event.
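The four ways of measuring elapsed time described above (clock time, event-wise, participating event-wise, and sending event-wise) can be compared in a short sketch. Events are represented as hypothetical `(sender, recipients, time)` tuples, and the function names, event data, and value of ρ are assumptions made for the example.

```python
import math

def decay(delta, rho):
    """Exponential decay applied to an abstract amount of elapsed time."""
    return math.exp(-rho * delta)

def in_window(events, t0, t1):
    """Events occurring after t0 and at or before t1."""
    return [e for e in events if t0 < e[2] <= t1]

def count_all(events, t0, t1):
    """Event-wise social time: every event in the window counts."""
    return len(in_window(events, t0, t1))

def count_participating(i, events, t0, t1):
    """Social time of actor i: events in which i was sender or recipient."""
    return sum(1 for s, r, _ in in_window(events, t0, t1) if i == s or i in r)

def count_sending(i, events, t0, t1):
    """Social time of actor i restricted to events that i sent."""
    return sum(1 for s, _, _ in in_window(events, t0, t1) if i == s)

events = [
    ("ann", {"bob"}, 1.0),
    ("bob", {"ann", "cat"}, 2.0),
    ("cat", {"dan"}, 3.0),
    ("ann", {"dan"}, 4.0),
]
t0, t1, rho = 1.0, 4.0, 0.5

elapsed = {
    "clock": t1 - t0,
    "event-wise": count_all(events, t0, t1),
    "participating": count_participating("ann", events, t0, t1),
    "sending": count_sending("ann", events, t0, t1),
}
relevance = {name: decay(delta, rho) for name, delta in elapsed.items()}
```

For the event at time 1.0 viewed from time 4.0, three events have elapsed in total, two of them involve "ann", and one was sent by "ann", so the sending event-wise relevance is the highest of the three social-time variants.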

Any of the exemplary decay functions described above may be used to determine a statistic s that is included in a probability model. The statistic s may be determined, for example, based on a selected subset of relational events, where the subset of relational events is selected according to a predicate $\Psi_I$ that is specific to the statistic s (601). Each selected event may then be weighted using a decay function (603), and the statistic s may be determined based on the weighted events (605). The weighted events may be summed, for example, in order to produce a value for s.
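The three operations of process 600 (select by predicate, weight by decay, sum) can be sketched as follows, using the activity and popularity predicates and a clock-time exponential decay. The tuple layout of events, the function names, and the numeric values are assumptions for illustration.

```python
import math

def statistic(events, t, predicate, rho):
    """Sum of clock-time decay weights over the events selected by the predicate."""
    return sum(
        math.exp(-rho * (t - tau))                   # weight each selected event (603)
        for sender, recipients, tau in events
        if tau <= t and predicate(sender, recipients)  # select the subset (601)
    )                                                # sum the weights (605)

def activity(i_star):
    """Predicate for the activity ego selection effect: events sent by i_star."""
    return lambda sender, recipients: sender == i_star

def popularity(i_star):
    """Predicate for the popularity ego selection effect: events sent to i_star."""
    return lambda sender, recipients: i_star in recipients

events = [
    ("ann", {"bob"}, 1.0),
    ("bob", {"ann", "cat"}, 2.0),
    ("cat", {"dan"}, 3.0),
    ("ann", {"dan"}, 4.0),
]
rho = math.log(2.0)   # assumed rate: relevance halves with each unit of clock time

s_activity = statistic(events, 5.0, activity("ann"), rho)
s_popularity = statistic(events, 5.0, popularity("ann"), rho)
```

Here "ann" sent events four units and one unit before time 5.0, giving an activity statistic of 1/16 + 1/2, and received one event three units earlier, giving a popularity statistic of 1/8.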

FIG. 7 is a flowchart of an example of a process 700 for analyzing social behavior. The process 700 may be implemented, for example, using system 100. In such an implementation, one or more parts of the process may be executed by inference engine 107, which may interface with other computers through a network. Inference engine 107 may, for example, send or receive data involved in the process to and from one or more local or remote computers.

A probability model that is populated based on a relational event history may be used to identify patterns, changes, and anomalies within the relational event history, and to draw sophisticated inferences regarding a variety of communications behaviors. As explained above, a baselining process may be used with a model to determine communications behavior that is “normal” for a given relational event history. The resulting baseline includes a set of effect values (statistical parameters) that correspond to the statistics included in the model. Once determined, the baseline can be employed to detect departures from normal communications behavior.

In more detail, the statistical parameters obtained through the baselining process can form the basis for follow-on analysis; in practical terms, pinning down normality during the baselining process makes it possible to detect abnormality, or anomalous behaviors, in the form of deviations from the baseline. Effect values for subsets of the relational event history (“pivots”) may be compared to the baseline values in order to determine whether, when, and/or how behavior deviates from the norm. A pivot may, for example, include only relational events involving a specific actor during a specific time period, and effect values for a pivot (“fixed effects”) may be used to create an alternative hypothesis that can be compared with a null hypothesis, the baseline, through a hypothesis-testing framework that, for example, employs a score test.

Hypothesis testing may allow for estimation of the degree to which the communications behavior described by a particular pivot departs from normality. Other, more involved inferences may also be drawn. For example, it is possible to determine whether a fixed effect has high or low values, is increasing or decreasing, or is accelerating or decelerating. Further, it is possible to determine whether or how these features hang together between multiple fixed effects.

Thus, and as is explained in more detail below, social behavior may be analyzed by selecting a subset of relational events, the subset containing pivots (701), estimating values for corresponding fixed effects (703), determining an alternative hypothesis (705), and testing the alternative hypothesis against the null hypothesis, the baseline (707).

Selection of subsets of relational events (701) may be arbitrary, or may instead be driven by the particular individuals and/or behaviors that are the subjects of analysis. Recalling the popularity ego selection effect described above, an analyst may wish to determine based on a relational event history, for example, whether a particular actor's popularity increased over a particular period of time, in which case the relational events selected for inclusion in the subset of relational events may include relational events that occurred during the particular period of time, and in which the particular actor was a participant.

In more detail, a set Ξ={ξ₁, ξ₂, . . . } includes pivots $\xi_i \subset \mathcal{E}$ that are selected subsets of the relational event history, and hypothesis testing may be performed in order to determine whether statistical parameters corresponding to effects being evaluated over the pivots are homogeneous over the entire relational event history $\mathcal{E}$, or if they are instead heterogeneous, differing for some elements of the set Ξ.

As described above, a statistical parameter may be denoted θ=(α,β,γ,η), where α denotes ego effects, β denotes mode effects, γ denotes topic effects, and η denotes alter effects. In order to account for heterogeneity of effects between pivots, the model may be elaborated to allow a set of statistical parameters Ω to vary across events. In the elaborated model, the set of statistical parameters may be denoted $\Omega = (\omega_1, \ldots, \omega_{|\Xi|})$, where $\omega_i$ corresponds to an i-th level pivot $\xi_i$. An i-th level pivot may, for example, contain a subset of relational events in which a particular actor i was a participant. Parameters may depend on events, such that:

$\theta_i(e) = \begin{cases} \omega_i & e \in \xi_i \\ \theta_0 & \text{else} \end{cases}$

Notably, $\omega_i = \theta_0$ yields a pivot-homogeneous specification. A model in which $\omega_i = \theta_0$ may be referred to as a restricted model, and a model in which $\omega_i$ is a free parameter may be referred to as an unrestricted model.

Having selected a subset of the relational event history for analysis, parameter estimation may be performed (703). Although it is possible to estimate effect values for a pivot using the same methods employed in determining the baseline, doing so may be computationally costly and ultimately unnecessary. As such, a single step of the Newton-Raphson algorithm may be employed, starting from the baseline values, and calculating the next, more likely effect values for the pivot. These pivot-specific next, more likely effect values are the fixed effects.

In more detail, fitting the restricted model results in a maximum likelihood estimate $\hat{\theta}_0$ for the restricted model, a score contribution for each event, $V_{\hat{\theta}_0}(\theta \mid e)$, evaluated at $\hat{\theta}_0$, and a Fisher information contribution for each event, $\mathcal{I}_{\hat{\theta}_0}(\theta \mid e)$, evaluated at $\hat{\theta}_0$.

The maximum likelihood estimate, score contributions, and Fisher information contributions may be used to execute a single Newton-Raphson step, to estimate $\omega_i$. Pivots may be indexed by i and, for each pivot, a score contribution $V_{\hat{\theta}_0}(\theta \mid \xi_i)$ and an information contribution $\mathcal{I}_{\hat{\theta}_0}(\theta \mid \xi_i)$ may be calculated, so as to produce a one-step estimator $\hat{\omega}_i$ that can be used to test hypotheses about the pivots, where:

$V_{\hat{\theta}_0}(\theta \mid \xi_i) = \sum_{e \in \xi_i} V_{\hat{\theta}_0}(\theta \mid e)$

$\mathcal{I}_{\hat{\theta}_0}(\theta \mid \xi_i) = \sum_{e \in \xi_i} \mathcal{I}_{\hat{\theta}_0}(\theta \mid e)$

$\hat{\omega}_i = \hat{\theta}_0 + V_{\hat{\theta}_0}^{T}(\theta \mid \xi_i)\, \mathcal{I}_{\hat{\theta}_0}^{-1}(\theta \mid \xi_i)$
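The one-step estimator can be illustrated with a deliberately simple stand-in model. The sketch below assumes a single scalar parameter and a hypothetical logistic model in which each event has a binary outcome y; that model, the function names, and the data are not taken from the described system and serve only to show how summed score and information contributions yield a single Newton-Raphson step from the baseline.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def score(theta, y):
    """Per-event score contribution for the assumed logistic model."""
    return y - sigmoid(theta)

def information(theta, y):
    """Per-event Fisher information contribution for the assumed logistic model."""
    p = sigmoid(theta)
    return p * (1.0 - p)

def one_step_estimate(theta0, pivot_events):
    """Single Newton-Raphson step from the baseline, using only the pivot's events."""
    v = sum(score(theta0, y) for y in pivot_events)
    info = sum(information(theta0, y) for y in pivot_events)
    return theta0 + v / info

theta0 = 0.0            # assumed baseline estimate from the restricted model
pivot = [1, 1, 1, 0]    # assumed outcomes for the events in one pivot
omega_hat = one_step_estimate(theta0, pivot)
```

In this example the full maximum likelihood estimate for the pivot would be log 3 (about 1.10), while the single step from the baseline gives 1.0, which shows the kind of inexpensive approximation the one-step approach provides.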

Estimates for fixed effects can be used to infer departures from usual behavior by constructing hypotheses about them, and each fixed effect value may correspond to a particular relational event included in the subset of relational events. A hypothesis may include predictions regarding heterogeneity of fixed effects over pivots, as well as temporal differencing of fixed effects.

Temporal differencing of a fixed effect may be defined over three levels: $\Delta_u$, the value of the fixed effect over a particular time period; velocity, $\Delta_u'$, the first difference of the fixed effect with respect to time; and acceleration, $\Delta_u''$, the second difference of the fixed effect with respect to time. Predictions may be made regarding, for example, the popularity of a particular individual within a particular time period, a rate at which the popularity of the individual increased or decreased over the particular time period, and as to whether the rate of change in popularity increased or decreased over the particular time period.

As described above, Ξ may be defined as a set of pivots $\xi_{it}$, where each $\xi_{it}$ is a subset of events pertaining to a particular actor i that occurred during a particular time period t. If $\omega_{it} \in \Omega$ is a fixed effect for the i-th pivot level over the t-th time interval, the levels of temporal differencing may be defined:

$\Delta_{it} = \omega_{it}$

$\Delta_{it}' = \omega_{it} - \omega_{i(t-1)}$

$\Delta_{it}'' = \Delta_{it}' - \Delta_{i(t-1)}'$
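The three levels of temporal differencing follow directly from a sequence of fixed effect estimates for one actor. In the sketch below, the list `omega` holds assumed fixed effect values over three consecutive time intervals, and the function name is hypothetical.

```python
def temporal_differences(omega, t):
    """Return (value, velocity, acceleration) of a fixed effect at interval t."""
    value = omega[t]
    velocity = omega[t] - omega[t - 1]
    previous_velocity = omega[t - 1] - omega[t - 2]
    acceleration = velocity - previous_velocity
    return value, velocity, acceleration

omega = [0.1, 0.3, 0.7]   # assumed fixed effects for intervals 0, 1, 2
value, velocity, acceleration = temporal_differences(omega, 2)
```

For these assumed values the effect is high (0.7), increasing (velocity of about 0.4), and accelerating (acceleration of about 0.2). Note that at least three intervals are needed before acceleration is defined.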

Hypotheses involving the levels $D_{it} = (\Delta_{it}, \Delta_{it}', \Delta_{it}'')$ and a weight vector q, the weights in the vector reflecting predictions regarding temporal differencing, may be formed, where $D_{it}$ and q are both row vectors. Specifically, a one-sided null hypothesis H₀, predicting homogeneity of effects over the pivots, and an alternative hypothesis, H₁, predicting heterogeneity, may be formed:

$H_0 : q D_{it}^{T} = 0$

$H_1 : q D_{it}^{T} > 0$

The weighting vector q enables discretion over the kinds of behavioral anomalies that may be detected through hypothesis testing, where values in the vector q reflect predictions regarding temporal differencing. A negative value for the velocity of an effect that is included in the weighting vector q, for example, indicates a prediction that the effect decreased over the time interval, while a positive value indicates a prediction that the effect increased over the interval. A value of zero would indicate a prediction that no change in the effect occurred over the time interval. Thus, a prediction that a particular actor was popular over a particular time period, that the popularity of the actor increased over the time interval, and that the rate of increase of the popularity of the actor over the time interval increased, for example, would be reflected in the weighting vector q by positive values for each of value, velocity, and acceleration of the popularity ego effect.

Following the determination of hypotheses, the hypotheses may be tested in order to determine departures from baseline communications behavior using a scalar test statistic Θ, which may be computed based on the fixed effect values (707):

$\Theta = \frac{q D_{it}^{T}}{\sqrt{\mathrm{Var}\left( q D_{it}^{T} \right)}}$

Given $\mathrm{Cov}\{D_{it}\}$, the covariance matrix of $D_{it}$, the variance of $q D_{it}^{T}$ has the form:

$\mathrm{Var}\{ q D_{it}^{T} \} = q\, \mathrm{Cov}\{ D_{it} \}\, q^{T}$

Thus, the test statistic Θ has an approximately standard normal null distribution, which permits easy testing of the null hypothesis H₀. Evaluations of expectations within the covariances are performed using the maximum likelihood estimate $\hat{\theta}$.
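The computation of the test statistic, and its comparison against a standard normal null distribution, can be sketched as follows. The values of `q`, `D`, and `cov` are assumed for illustration (the covariance matrix in particular is invented rather than estimated), and the function names are hypothetical.

```python
import math

def theta_statistic(q, D, cov):
    """Scalar statistic q D^T / sqrt(q Cov{D} q^T) for row vectors q and D."""
    n = len(q)
    numerator = sum(q[k] * D[k] for k in range(n))
    variance = sum(q[a] * cov[a][b] * q[b] for a in range(n) for b in range(n))
    return numerator / math.sqrt(variance)

def one_sided_p_value(z):
    """Upper-tail probability of a standard normal variable exceeding z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

D = [0.7, 0.4, 0.2]       # assumed value, velocity, and acceleration
q = [0.0, 1.0, 0.0]       # prediction: the effect increased (velocity only)
cov = [[0.04, 0.0, 0.0],  # assumed covariance matrix of D
       [0.0, 0.04, 0.0],
       [0.0, 0.0, 0.09]]

theta = theta_statistic(q, D, cov)
p_value = one_sided_p_value(theta)
```

With these assumed inputs the statistic is 2.0, which corresponds to an upper-tail probability of roughly 0.023 under the standard normal null distribution.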

Given the fixed effect estimates that were determined via execution of a single step of the Newton-Raphson algorithm, estimation of $D_{it}$ is straightforward. Given the fixed effect estimates, $\mathrm{Cov}\{D_{it}\}$ may be estimated by partitioning $\mathrm{Cov}\{D_{it}\}$ into blocks that may be derived separately and that are expressed in terms of covariances of the fixed effects estimates (i.e., $\mathrm{Cov}\{\hat{\omega}_{it}\}$):

$\mathrm{Cov}(D_{it}) = \begin{bmatrix} \mathrm{Cov}\{\Delta_{it}\} & \mathrm{Cov}\{\Delta_{it}, \Delta_{it}'\}^{T} & \mathrm{Cov}\{\Delta_{it}, \Delta_{it}''\}^{T} \\ \mathrm{Cov}\{\Delta_{it}, \Delta_{it}'\} & \mathrm{Cov}\{\Delta_{it}'\} & \mathrm{Cov}\{\Delta_{it}', \Delta_{it}''\}^{T} \\ \mathrm{Cov}\{\Delta_{it}, \Delta_{it}''\} & \mathrm{Cov}\{\Delta_{it}', \Delta_{it}''\} & \mathrm{Cov}\{\Delta_{it}''\} \end{bmatrix}$

Estimation of the covariances may be accomplished through their relationship to the observed Fisher information matrix. For any two fixed effects estimates $\hat{\omega}_a$, $\hat{\omega}_b$, the intersection of their corresponding pivots $\xi_a$, $\xi_b$ may be denoted:

$\xi_* = \xi_a \cap \xi_b$

Computation of the observed Fisher information, which yields an estimator for $\operatorname{Cov}\{\hat{\omega}_{a}, \hat{\omega}_{b}\}$ that may be used in evaluating $\left[ \mathcal{I}_{\hat{\theta}_{0}}^{(*)}\left( \theta \mid \Xi_{i} \right) \right]^{-1}$, may be performed:

$\mathcal{I}_{\hat{\theta}_{0}}^{(*)}\left( \theta \mid \Xi_{i} \right) = \sum\limits_{e \in \Xi_{*}} \mathcal{I}_{\hat{\theta}_{0}}\left( \theta \mid e \right)$
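The summation above may be sketched as follows: the per-event information matrices are summed over the events that fall in both pivots, and the result is inverted. The event identifiers, the 2x2 per-event matrices, and the function names are hypothetical; in particular, the computation of each per-event information matrix from the probability model is assumed to have been performed elsewhere.

```python
def add_matrices(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def shared_information(pivot_a, pivot_b, info_by_event, dim):
    """Sum per-event information over the intersection of two pivots."""
    shared_events = set(pivot_a) & set(pivot_b)
    total = [[0.0] * dim for _ in range(dim)]
    for event in sorted(shared_events):
        total = add_matrices(total, info_by_event[event])
    return total


def invert_2x2(m):
    """Invert a 2x2 matrix; sufficient for this two-parameter sketch."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("information matrix is singular")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


# Hypothetical per-event information matrices for two parameters.
info_by_event = {
    "e1": [[2.0, 0.0], [0.0, 1.0]],
    "e2": [[1.0, 0.0], [0.0, 1.0]],
    "e3": [[1.0, 0.0], [0.0, 3.0]],
    "e4": [[5.0, 0.0], [0.0, 5.0]],
}
pivot_a = {"e1", "e2", "e3"}
pivot_b = {"e2", "e3", "e4"}

information = shared_information(pivot_a, pivot_b, info_by_event, dim=2)
covariance_estimate = invert_2x2(information)
```

Only events e2 and e3 belong to both pivots, so only their information contributes to the sum.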

Having estimated covariances of the fixed effect estimates, the scalar test statistic Θ may be calculated and used to test the null hypothesis H₀ through a simple score test, in which the value of the scalar test statistic is compared to the null hypothesis. If the result of the simple score test is equal to or less than zero, there is no confidence in the alternative hypothesis, indicating that the predictions with regard to the heterogeneity that are reflected in the weighting vector q may be wrong. Alternatively, if the result of the simple score test is positive, then there is some level of confidence that the predictions are correct, with larger positive values indicating a greater degree of confidence. Thus, if the score test results in a positive value, the alternative hypothesis will have been confirmed with some degree of confidence, which in turn indicates that communications behavior varied from the baseline with regard to the selected pivots and effects, and that the predictions regarding the anomalous communication behavior that underlie the alternative hypothesis are correct.
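Because the test statistic has an approximately standard normal null distribution, one conventional way to express the degree of confidence is a one-sided upper-tail probability. The following sketch illustrates that reading; the significance level `alpha`, the function names, and the wording of the returned labels are assumptions made for this example and are not specified in the description.

```python
import math


def upper_tail_probability(statistic):
    """Return P(Z >= statistic) for a standard normal variable Z."""
    return 0.5 * math.erfc(statistic / math.sqrt(2.0))


def assess_score_test(statistic, alpha=0.05):
    """Interpret the scalar test statistic (alpha is a hypothetical cutoff)."""
    if statistic <= 0.0:
        return "no confidence in the alternative hypothesis"
    if upper_tail_probability(statistic) < alpha:
        return "departure from baseline supported"
    return "weak support for the alternative hypothesis"


p_value = upper_tail_probability(2.0)
verdict = assess_score_test(2.0)
```

A statistic of 2.0 gives an upper-tail probability of roughly 0.023, which falls below the assumed cutoff of 0.05, whereas a statistic at or below zero is treated as providing no support for the predictions encoded in q.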

FIGS. 8-23 provide examples of interfaces and data visualizations that may be output to a user, for example, using UI 108 of system 100. UI 108 may be used to receive inputs from a user of system 100, and may be used to output data to the user. A user of the system may, for example, specify data sources to be used by normalizer 103 in determining relational history 104 and covariate data 105. The user may also provide inputs indicating which actors, modes, topics, covariates, and/or effects to include in model 106, as well as which subsets of the relational event history should be analyzed in determining departures from the baseline. UI 108 may provide the user with outputs indicating a formulation of model 106, a determined baseline for the model 106, and one or more determinations as to whether and how the subsets of relational event history being analyzed differ from the baseline. UI 108 may also provide the user with additional views and analyses of data so as to allow the user to draw additional inferences from the model 106.

UI 108 may provide to the user, based on probability model 106, a determined baseline, and/or one or more determined departures from the baseline, textual and/or graphical analyses of data that uncover patterns, trends, anomalies, and change in the data.

FIG. 8, for example, includes an example interface 801 that enables a user to translate business questions into hypotheses for testing data in a relational event history. In more detail, a hypothesis dashboard 802 may enable a user to form a hypothesis regarding communications behavior by selecting one or more effects from a list of possible effects, and one or more covariates from a list of possible covariates. The selection may occur by dragging and dropping the effects and covariates from one or more windows or spaces listing possible effects and/or covariates into a window or space defining the hypothesis, and a user may be able to further select properties of a selected effect or covariate by interacting with buttons or text corresponding to the desired properties. Descriptions of the effects, covariates, and properties may be displayed for the user in response to detection of hovering, and a textual description 803 of the hypothesis may be dynamically formed and displayed to the user as effects, covariates, and properties are selected by the user. Alternatively, or in addition, the user may manually enter a textual description of the hypothesis being formed.

FIG. 9 includes example visualizations 901 and 902, which include, respectively, curves 903 and 904 that may be generated based on statistical analysis of data in a relational event history and then displayed to a user. The displayed curves may be formed, for example, based on the effects and covariates included in the user's hypothesis, and the effects included in the hypothesis may be used in a baselining process that results in a determination of normality. Following the determination of a baseline of communications behavior, anomalous (i.e., abnormal) behavior may be inferred in the form of deviations from the baseline over a period of time.

The curves displayed to the user may correspond to an actor, to a pair of actors, and/or to all actors in a relational event history, and may indicate periods of time in which communications behavior is normal, and periods of time in which communications behavior is abnormal. The indication may be provided, for example, through color coding, in which periods of normal behavior are assigned one color, and in which periods of abnormal behavior are assigned another color. UI 108 may enable a user to select between varying levels of sensitivity through which normality and abnormality are determined, and the visualizations displayed to the user may dynamically update as selection of sensitivity occurs.

Multiple curves, each representing a different effect or covariate, may be displayed in an overlapping fashion, and the curves may be normalized and smoothed. The curves may be displayed in two dimensions, where the x axis is associated with time, and where the y axis is associated with properties of the effects.
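The description does not state how curves are normalized or smoothed. Purely as one possible approach, the sketch below rescales a series to the range 0 to 1 and applies a centered moving average; both techniques, the function names, and the default window of three samples are assumptions made for this example.

```python
def min_max_normalize(values):
    """Rescale values to [0, 1]; a constant series maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the series edges.

    The window is expected to be odd so that it is centered on each sample.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        segment = values[max(0, i - half):min(len(values), i + half + 1)]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# Hypothetical effect series sampled at equal time steps.
raw_curve = [2.0, 4.0, 6.0]
normalized_curve = min_max_normalize(raw_curve)
smoothed_curve = moving_average([0.0, 3.0, 6.0], window=3)
```

Normalizing each curve to a common range allows curves for different effects or covariates to share one y axis when displayed in overlapping fashion.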

UI 108 may display icons and/or labels to indicate effects, covariates, or actors to which a curve corresponds, as well as text identifying effects, covariates, or actors to which a curve corresponds. The user interface may further display labels within the two dimensional field in which the curves are plotted, the labels indicating exogenous events or providing other information.

A user may select an actor or actors, and the selection may result in curves corresponding to individual effects and/or covariates being displayed for the selected actor or actors, and lines corresponding to multiple effects and/or covariates being displayed for unselected actors. A user may also be able to “zoom in” to the displayed curves in order to view a display of events at a more granular level, with information pertaining to the displayed events being presented through icons that are color-coded according to mode, and that are sized to indicate, for example, a word count or length of a conversation. UI 108 may also provide access to individual communications related to events. UI 108 may, for example, display an individual email in response to a user selection or zoom.

FIG. 10 includes an example visualization in which, in response to a user selection, UI 108 displays, for a particular actor, behavioral activity over time. The behavioral activity may be displayed, for example, in a polar coordinate system, where the radial coordinates of points in the plane correspond to days, and where angular coordinates correspond to time slices within a particular day.

In more detail, each point plotted in the plane, for example, point 1002, may correspond to a particular communications event, and may be color coded to indicate a mode of communication or other information. Points plotted in the plane may be located on one of several concentric circles displayed on the graph, and the circle on which a point is located may indicate a day on which the communication took place, while the position of the point on the circle may indicate a time slice (e.g., the hour) in which the communication took place. A user may be able to select a communication event that is plotted on the graph to obtain content. For example, a user may click or hover over a point that, like 1002, indicates an email message in order to retrieve the text of the email message.
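The mapping from an event timestamp to a position on the polar graph may be sketched as follows. The ring spacing, the use of one-hour time slices, the clockwise-from-top angle convention, and the function names are assumptions made for this example.

```python
import math
from datetime import date, datetime


def polar_position(timestamp, first_day, ring_spacing=1.0):
    """Map an event to (radius, angle): one ring per day, angle by hour."""
    day_index = (timestamp.date() - first_day).days
    if day_index < 0:
        raise ValueError("event precedes the first displayed day")
    radius = (day_index + 1) * ring_spacing
    angle = 2.0 * math.pi * timestamp.hour / 24.0
    return radius, angle


def to_cartesian(radius, angle):
    """Convert to screen-style x, y with midnight at the top, clockwise."""
    return radius * math.sin(angle), radius * math.cos(angle)


# Hypothetical email sent at 06:30 on the third displayed day.
event_time = datetime(2013, 3, 3, 6, 30)
radius, angle = polar_position(event_time, first_day=date(2013, 3, 1))
x, y = to_cartesian(radius, angle)
```

In this sketch the event lands on the third concentric circle, a quarter of the way around from midnight, so that events occurring at the same hour on different days line up along a common ray.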

In some implementations of UI 108, a polar graph indicating behavioral activity over time may be animated, enabling a user to “fly through” a visual representation of behavioral activity over particular periods of time, where the circles and events presented for a particular period of time correspond to the days within that period, and where the period displayed dynamically shifts in response to user input.

FIG. 11 includes example visualizations 1101, 1102, and 1103 in which UI 108 displays, for a plurality of actors, behavioral activity over time. For example, as shown in visualizations 1101, 1102, and 1103, UI 108 may display, for a particular actor or for a combination of actors, communications information in a matrix. For each day and time slice, an icon may represent the communication or communications that took place, with the size of the icon indicating communications volume. Icon 1104, for example, is larger than icon 1105, and therefore indicates a greater volume of communications.

Multiple modes of communication may be displayed at a particular point in the matrix. This may be accomplished through the use of a color coding system in which each mode of communication is assigned a color, and in which icons of varying size and color are generated based on the number of communications of each mode that occurred during a given period of time. When color coding is used, the generated icons may be displayed in overlapping fashion, providing the user with a visual representation of the number and type of communications that occurred at a particular date and period of time.
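The aggregation underlying such a matrix may be sketched as a count of events per day, time slice, and mode. The one-hour slice width, the square-root sizing rule (so that icon area grows with volume), and the function names are assumptions made for this example rather than details of the described interface.

```python
import math
from collections import Counter
from datetime import date, datetime


def build_activity_matrix(events, slice_hours=1):
    """Count events per (day, time slice, mode) cell."""
    counts = Counter()
    for timestamp, mode in events:
        cell = (timestamp.date(), timestamp.hour // slice_hours, mode)
        counts[cell] += 1
    return counts


def icon_radius(count, base_radius=3.0):
    """Hypothetical sizing rule: icon area proportional to volume."""
    return base_radius * math.sqrt(count)


# Hypothetical events as (timestamp, mode) pairs.
events = [
    (datetime(2013, 3, 1, 9, 5), "email"),
    (datetime(2013, 3, 1, 9, 40), "email"),
    (datetime(2013, 3, 1, 9, 59), "phone"),
    (datetime(2013, 3, 2, 14, 0), "email"),
]
matrix = build_activity_matrix(events)
```

Each cell of the resulting counter may then be drawn as one icon per mode, with the icons for a given day and time slice overlapping at the same position in the matrix.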

UI 108 may enable a user to select a communication event that is shown in the matrix to obtain content, and the matrix may be animated to show different days and time periods in response to user input. A user may, for example, “scroll through” the matrix.

FIG. 12 includes example visualizations 1201, 1202, and 1203 in which UI 108 displays, for a particular actor or for a combination of actors, data corresponding to multiple modes of communication, where icons of varying size or bars of varying length are generated based on the number of communications of each mode that occurred during a given period of time. The particular modes that are displayed and the granularity of the display may be determined by a user, as shown at 1204.

UI 108 may enable a user to interact with an icon to obtain additional information regarding communications events that occurred during a given period of time, and the user may interact with a displayed communication event in order to obtain content. Selection of a particular email, for example, may result in display of the email's text, as depicted in visualization 1202.

FIG. 13 depicts example visualizations 1301, 1302, and 1303, in which UI 108 displays, for a particular actor, a summary of communications of various modes over a period of time, as well as bibliographic information and information pertaining to the actor's relationships with other actors in the relational event history.

The summary of communications may be provided as in visualization 1301, for example, in which multiple two-dimensional curves that correspond to different modes of communication are plotted in graph 1304, in which the x axis is associated with time, and in which the y axis is associated with a number of communications events.

UI 108 may display information pertaining to the actor's relationships in the form of a quad chart, such as chart 1305, where points that represent individual relationship dyads including the actor are plotted on the chart, and where one axis of the chart is associated with numbers of communications sent to the actor by the other actors in each dyad, and where the other axis of the chart is associated with numbers of communications sent by the actor to the other actors in each dyad. A listing of strong relationships (e.g., relationships involving the highest amount of communications activity) may also be included.
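The coordinates plotted on such a quad chart may be derived as in the following sketch, which counts, for each dyad that includes the actor, the communications sent by the actor and the communications received by the actor. The quadrant labels, the cutoff parameter, and the function names are hypothetical and are not part of the described interface.

```python
from collections import defaultdict


def dyad_counts(events, actor):
    """Return {other: (sent_by_actor, received_by_actor)} for one actor."""
    counts = defaultdict(lambda: [0, 0])
    for sender, recipient in events:
        if sender == actor and recipient != actor:
            counts[recipient][0] += 1
        elif recipient == actor and sender != actor:
            counts[sender][1] += 1
    return {other: tuple(pair) for other, pair in counts.items()}


def quadrant(sent, received, cut):
    """Assign a point to one of four hypothetical quadrants."""
    if sent >= cut and received >= cut:
        return "mutual"
    if sent >= cut:
        return "mostly outgoing"
    if received >= cut:
        return "mostly incoming"
    return "weak"


def strongest_relationships(dyads, limit=5):
    """List dyads by total communications activity, highest first."""
    totals = [(other, sent + received) for other, (sent, received) in dyads.items()]
    return sorted(totals, key=lambda item: (-item[1], item[0]))[:limit]


# Hypothetical (sender, recipient) events.
events = [("ann", "bob"), ("ann", "bob"), ("bob", "ann"),
          ("cat", "ann"), ("ann", "ann"), ("bob", "cat")]
dyads = dyad_counts(events, "ann")
ranking = strongest_relationships(dyads)
```

Events that do not involve the actor, and messages that the actor sends to itself, are ignored in this sketch.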

FIG. 14 includes example visualizations 1401 and 1402, in which UI 108 displays, for a particular actor, a comprehensive view of the actor's communications and other behavior over a particular period of time. UI 108 may display, as in 1401, a view of communications and transactions over a period of time, provided with context and with user-specified filters.

UI 108 may include, as in 1402, a “crosshair” 1403 that provides context with regard to a relational event's position within a relational event history, and with regard to trends and other factors. The user may interact with the crosshair 1403 through, for example, side-to-side and up-and-down swiping. Content may be updated as the crosshair is moved, the content including key and calendar events, trends, and geospatial locations.

FIG. 15 provides example visualization 1501, in which UI 108 displays, for a corpus of communications data corresponding to a relational event history, a multimodal synthesis of relational and other events involving a particular actor. UI 108 may present information pertaining to communications and to other events, and may chart the occurrence of the events over time. UI 108 may provide, for example, a detailed “replay” 1502 of information pertaining to events that involved the actor and that took place on a particular day.

FIG. 16 provides example visualization 1601, in which UI 108 displays radial graphs of behavioral characteristics pertaining to an actor. Characteristics that may be graphed include, for example, relationships, key words used, and modes of communication. Graph 1602, for example, displays percentages of communications that occurred through each of three modes, while graph 1603 displays percentages of communications that are characterized as either incoming or outgoing. Radial graphs may also measure and compare elements relating to an actor's position in a network and/or to overall communication tendencies.

FIG. 17 includes example visualization 1701, in which UI 108 displays a comprehensive view 1702 of the history and attributes of stock data and performance, for comparison with an actor's activity. Like stock data, other external data may be compared to relational events or to the relational history of a particular actor, for purposes of correlation and investigation.

FIG. 18 includes example visualizations 1801, 1802, and 1803, in which UI 108 displays, for a corpus of communications data corresponding to a relational event history, a frequency of usage of a particular word or phrase. UI 108 may, for example, plot word usage with respect to time, and may indicate individual communications in which the word was used. The interface may further identify actors who used the word, and through what mode, as indicated at 1804.

FIG. 19 provides example visualization 1901, in which UI 108 combines graphics with an interactive word-tree 1902 that acts as a search filter. The interactive word-tree may be used, for example, to sift through a corpus of communications data for instances of the use of a particular word or phrase.

FIG. 20 includes example visualizations 2001 and 2002. Visualization 2001 depicts a particular interface of UI 108 in which line graphs are displayed overlaid with word trending based on frequency counts of numbers of documents containing a user-specified word. The frequency counts may be taken over a period of time (e.g., daily). Visualization 2002 depicts a particular interface of UI 108 in which each relational event appears in a plot that enables a user to scroll through email or other relational event content. Distinct coloration may be used to indicate the particular event to which presently-viewed content corresponds, and colors of events may change as the user scrolls from event to event.
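The daily frequency counts on which such word trending is based may be sketched as a count, per day, of the documents that contain the user-specified word. The case-insensitive whole-word matching rule and the function name are assumptions made for this example.

```python
import re
from collections import Counter
from datetime import date


def daily_document_counts(documents, word):
    """Count, per day, the documents containing the word at least once."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    counts = Counter()
    for day, text in documents:
        if pattern.search(text):
            counts[day] += 1
    return dict(sorted(counts.items()))


# Hypothetical corpus as (day, text) pairs.
documents = [
    (date(2013, 3, 1), "Quarterly earnings are due."),
    (date(2013, 3, 1), "Lunch?"),
    (date(2013, 3, 1), "earnings earnings"),
    (date(2013, 3, 2), "No EARNINGS call today"),
]
trend = daily_document_counts(documents, "earnings")
```

A document that repeats the word is counted once, because the quantity of interest here is the number of documents containing the word rather than the number of occurrences.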

FIG. 21 provides example visualization 2101, in which UI 108 pairs a display of a network diagram 2102, the diagram being produced based on communication thresholds, with a communications quad chart 2103 and with corresponding measures of strengths of relationships that include directionality of communication.
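One way in which a network diagram could be produced from communication thresholds is sketched below: directed sender-to-recipient counts are tallied, and only pairs that meet the threshold are retained as edges. The description does not specify the thresholding rule, so the rule, the function name, and the sample events are assumptions.

```python
from collections import Counter


def thresholded_edges(events, threshold):
    """Keep directed (sender, recipient) pairs with at least `threshold` events."""
    counts = Counter(
        (sender, recipient) for sender, recipient in events if sender != recipient
    )
    return {pair: n for pair, n in counts.items() if n >= threshold}


# Hypothetical (sender, recipient) events.
events = (
    [("ann", "bob")] * 3
    + [("bob", "ann")]
    + [("ann", "cat")] * 2
    + [("cat", "cat")]
)
edges = thresholded_edges(events, threshold=2)
```

Because the counts are kept per direction, the retained edges preserve the directionality of communication, so that an edge from one actor to another may be drawn even when the reverse direction falls below the threshold.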

FIG. 22 includes example visualizations 2201, 2202, and 2203, which indicate analyses of a corpus of communications data that may be performed using system 100. As explained in detail above, system 100 may be used to pose a question, and to create a hypothesis that might indicate an answer to that question. The hypothesis may be expressed in terms of a language of behavioral effects, and hypothesis testing may identify individuals who have exhibited behavior that relates to the hypothesis. As shown at 2204, for example, UI 108 has employed shading to illustrate a degree to which the behavior of each identified individual deviates from the norm.

Similarly, in 2202, graphs are used to illustrate variance in an individual's behavior, in contrast to the norm, over time. Events may be represented in a graph by a vertical line, such as line 2205, providing context for behavioral changes. Quarterly earnings for a company, for example, may be indicated in a graph.

Communications content may be overlaid in a graph, as in 2203, with each communication event represented as an icon. Each email sent or received by an individual in question may be represented, for example, by a blue or brown dot, the color indicating whether the email was sent or received. A user may be able to click on or otherwise interact with a dot to view additional information relating to the corresponding communication. For example, a user may be able to view the textual content of an email that was sent by a particular actor.

FIG. 23 includes example visualizations 2301, 2302, and 2303, which indicate further analyses of a corpus of communications data that may be performed using system 100. As 2301 indicates, abnormal content and communications activity, aggregated by day, may be visualized by UI 108 as an irregularity heat map, and shading may be used to indicate a degree or an amount of abnormal or irregular activity that occurred on a given day.

Tendencies of individual actors to contact other actors may also be visualized, as in 2302. A tendency of internal actors to email external individuals, for example, may be visualized for a given time period, with actors ranked in order of number of emails sent to the actor, or received by the actor, involving external contacts. A count of external emails for each actor may be expressed in the graph, and an event may be overlaid to provide context.

The popularity of terms in a corpus of communications, and their frequency of use over time, may also be presented by UI 108, as shown in 2303. A table 2304 identifying the first times at which words were used in a collection of emails may be paired, for example, with an illustration of usage trends for the words.

A number of implementations have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the disclosure.

For example, elements of different implementations described herein may be combined to form other implementations not specifically set forth above. Elements may be left out of the systems, processes, computer programs, etc. described herein without adversely affecting their operation. Furthermore, various separate elements may be combined into one or more individual elements to perform the functions described herein.

In addition, the logic flows depicted in the figures do not require the particular order shown, or sequential order, to achieve desirable results. In addition, other steps may be provided, or steps may be eliminated, from the described flows, and other components may be added to, or removed from, the described systems. Accordingly, other implementations not specifically described herein are within the scope of the following claims.

What is claimed is:
 1. A method comprising: determining a relational event history based on a data set, the relational event history comprising a set of relational events that occurred in time among a set of actors; populating data in a probability model based on the relational event history, wherein the probability model is formulated as a series of conditional probabilities that correspond to a set of sequential decisions by an actor for each relational event, and wherein the probability model includes one or more statistical parameters and corresponding statistics; determining, by one or more processing devices, a baseline communications behavior for the relational event history based on the populated probability model, wherein the baseline comprises a first set of values for the one or more statistical parameters; and determining departures from the baseline communications behavior within the relational event history, wherein determining departures from the baseline communications behavior within the relational event history comprises: selecting a subset of relational events included within the relational event history; determining a second set of values for the statistical parameters based on the subset of relational events; determining a hypothesis regarding communications behavior within the relational event history; and testing the hypothesis using the second set of values.
 2. The method of claim 1, wherein the one or more statistics relate to one or more of senders of relational events, modes of relational events, topics of relational events, or recipients of relational events.
 3. The method of claim 1, wherein determining the hypothesis regarding communications behavior within the relational event history comprises selecting a set of predictions regarding temporal differencing of communications behavior with respect to one or more of the statistical parameters over a period of time.
 4. The method of claim 3, wherein the set of predictions regarding temporal differencing include predictions selected from the group comprising a value of a statistical parameter, a velocity of a statistical parameter, and an acceleration of a statistical parameter.
 5. The method of claim 1, wherein testing the hypothesis using the second set of values comprises computing a value of a test statistic based on the second set of values and using the test statistic to determine departures from the baseline communications behavior.
 6. The method of claim 5, wherein using the test statistic to determine departures from the baseline communications behavior comprises comparing the value of the test statistic to the hypothesis.
 7. The method of claim 1, wherein each value in the second set of values corresponds to a particular relational event included in the subset of relational events.
 8. The method of claim 1, wherein the hypothesis relates to one or more of senders of relational events, modes of relational events, topics of relational events, or recipients of relational events.
 10. A system comprising: one or more processing devices; and one or more non-transitory computer-readable media coupled to the one or more processing devices having instructions stored thereon which, when executed by the one or more processing devices, cause the one or more processing devices to perform operations comprising: determining a relational event history based on a data set, the relational event history comprising a set of relational events that occurred in time among a set of actors; populating data in a probability model based on the relational event history, wherein the probability model is formulated as a series of conditional probabilities that correspond to a set of sequential decisions by an actor for each relational event, and wherein the probability model includes one or more statistical parameters and corresponding statistics; determining, by one or more processing devices, a baseline communications behavior for the relational event history based on the populated probability model, wherein the baseline comprises a first set of values for the one or more statistical parameters; and determining departures from the baseline communications behavior within the relational event history, wherein determining departures from the baseline communications behavior within the relational event history comprises: selecting a subset of relational events included within the relational event history; determining a second set of values for the statistical parameters based on the subset of relational events; determining a hypothesis regarding communications behavior within the relational event history; and testing the hypothesis using the second set of values.
 11. The system of claim 10, wherein the one or more statistics relate to one or more of senders of relational events, modes of relational events, topics of relational events, or recipients of relational events.
 12. The system of claim 10, wherein determining the hypothesis regarding communications behavior within the relational event history comprises selecting a set of predictions regarding temporal differencing of communications behavior with respect to one or more of the statistical parameters over a period of time.
 13. The system of claim 12, wherein the set of predictions regarding temporal differencing include predictions selected from the group comprising a value of a statistical parameter, a velocity of a statistical parameter, and an acceleration of a statistical parameter.
 14. The system of claim 10, wherein testing the hypothesis using the second set of values comprises computing a value of a test statistic based on the second set of values and using the test statistic to determine departures from the baseline communications behavior.
 15. The system of claim 14, wherein using the test statistic to determine departures from the baseline communications behavior comprises comparing the value of the test statistic to the hypothesis.
 16. The system of claim 10, wherein each value in the second set of values corresponds to a particular relational event included in the subset of relational events.
 17. The system of claim 10, wherein the hypothesis relates to one or more of senders of relational events, modes of relational events, topics of relational events, or recipients of relational events.
 18. A non-transitory computer-readable medium embodying one or more instructions thereon which, when executed, cause one or more computer processors to perform steps comprising: determining a relational event history based on a data set, the relational event history comprising a set of relational events that occurred in time among a set of actors; populating data in a probability model based on the relational event history, wherein the probability model is formulated as a series of conditional probabilities that correspond to a set of sequential decisions by an actor for each relational event, and wherein the probability model includes one or more statistical parameters and corresponding statistics; determining, by one or more processing devices, a baseline communications behavior for the relational event history based on the populated probability model, wherein the baseline comprises a first set of values for the one or more statistical parameters; and determining departures from the baseline communications behavior within the relational event history, wherein determining departures from the baseline communications behavior within the relational event history comprises: selecting a subset of relational events included within the relational event history; determining a second set of values for the statistical parameters based on the subset of relational events; determining a hypothesis regarding communications behavior within the relational event history; and testing the hypothesis using the second set of values.